ABSTRACT


This thesis examines two problems in on-board computing for space vehicles and 
develops rules for applying Reduced Precision Redundancy (RPR) as a new method of 
fault tolerance in Field Programmable Gate Arrays against Single Event Effects due to 
radiation on orbit. RPR was discovered by Snodgrass in 2006 and was first demonstrated 
using the single-input CORDIC algorithm. This research applies RPR to elementary 
multiple-input arithmetic operations (addition, subtraction, multiplication, division) and 
extends applications to multi-level combinations of these operations as they appear in 
spacecraft subsystems, specifically communication and attitude determination and 
control. Further modeling and simulation work explores the impact of varying levels of 
reduction in precision on the performance of communication and control systems using 
RPR. Finally, a higher-fidelity dynamics model and control system are developed for the 
NPS Bifocal Relay Mirror Spacecraft simulator, and potential application points for 
selective redundancy using RPR are identified. 
EXECUTIVE SUMMARY


The harsh radiation environment of space generates faults in FPGAs that affect 
both data and configuration memory. The Configurable Fault Tolerant Processor at the 
Naval Postgraduate School is a platform for testing methods of fault tolerance that guard 
against the single-event effects of radiation in FPGAs. In 2006 Snodgrass introduced a 
new method of fault tolerance, Reduced Precision Redundancy (RPR), as a power-saving 
alternative to traditional Triple Modular Redundancy (TMR). This research focuses on 
the details of implementing RPR and the effect of RPR fault tolerance on the 
performance of spacecraft systems. 

Two categories of system architectures are discussed: recursive data management, 
found in feedback control systems; and flow-through data management, found in signal 
processing tools such as the fast Fourier transform. Examples of the two architectures are 
broken down into their elementary operations, and the common operations are chosen as 
the subjects of experiments in RPR implementation. The “degree of RPR” is defined as a 
measure of reduction in precision. Detailed RPR designs for addition/subtraction and 
multiplication are programmed, simulated and mapped to the Virtex™ XQVR600 FPGA 
using the Xilinx Integrated Software Environment. Versions of each operation are built 
in TMR and several degrees of RPR, and the FPGA resources required for each degree of 
RPR are compared to the resources used by the corresponding TMR experiments. The 
results obtained from the detailed designs are extrapolated to estimate the resources 
required to implement RPR division and the compound operations of matrix 
multiplication and the fast Fourier transform butterfly machine. 

An evaluation of RPR-protected system performance is conducted on models of 
recursive and flow-through data architecture systems using MATLAB and Simulink 
computational tools. Transient and persistent errors are modeled as delta and step 
functions of additive noise in the signal data flow, and RPR error correction is modeled 
as an increase in signal-to-noise ratio whose magnitude depends on the degree of RPR. 
The improvement in system response and reduction in output error between “no fault 
tolerance” and “RPR fault tolerance” are measured to determine the impact of RPR on 
system performance. 

One example system is also studied further, in order to improve it and assess it as 
a complicated candidate system for RPR. A new dynamics model is developed for the 
Bifocal Relay Mirror Satellite Simulator test bed at NPS. It describes the effects of three 
flexible appendages, which expands the model to a twelve-state system with limited 
observable output. A linear-quadratic Gaussian controller is developed for the flexible 
structure model by combining a Kalman filter state estimator with a cost-optimal linear 
quadratic regulator (LQR). The new dynamics and control system is tested using a 
reference maneuver, and control cost is compared to the cost of using a simple PD 
controller. Finally, the enhanced BRMSS model is examined as a candidate system for 
RPR, and operations suitable for applying RPR are identified. 

Through the research presented in this thesis, it is shown that RPR can be 
implemented for arithmetic operations using standard rules for upper and lower bound 
determination. It is shown that the FPGA area and power savings gained by using RPR 
instead of TMR increase with the complexity of the module being protected, and that the 
complexity of an RPR voter compared to a TMR voter makes RPR application a more 
efficient choice for large multi-part operations. It is also demonstrated that transient 
errors are less damaging than persistent errors to systems of recursive and flow-through 
data architectures. An example system simulation shows that RPR of a degree less than 
16/52 provides satisfactory performance in the presence of either transient or persistent 
errors. It is concluded that the trade space defined by FPGA capacity, speed of operation, 
and error tolerance requirements must be examined by a system developer in order to 
determine the optimal level and degree at which to apply RPR fault tolerance. 
I. INTRODUCTION


As the electronics of space systems have evolved since the 1960s from masses of 
analog and digital circuits to principally digital systems, the responsibility placed on 
computers operating in the harsh radiation environment of space has greatly increased. 
Most modern spacecraft have passive redundancy in the computer systems that conduct 
centralized control of the satellite and run communications tasks, attitude control tasks, 
and often on-board payload data processing tasks. The combination of the inaccessibility 
of most spacecraft after launch and the mass and power restrictions imposed on 
spacebound payloads generates a strong argument for choosing reconfigurable digital 
hardware for some applications in space vehicle computers. Specifically, field 
programmable gate arrays (FPGA) offer a high degree of flexibility for supporting 
multiple applications through reprogramming as well as the speed necessary to operate 
complex architectures in real time [1]. 

The challenges faced by the developer of an FPGA-based system in space are not 
trivial. In addition to the damage of long-term radiation exposure, which can be 
minimized using standard electronics shielding techniques [2], FPGAs are susceptible to 
errors in both data and architecture configuration caused by single event effects (SEE). 
Over the past decade, the computer engineering group at the Naval Postgraduate School 
has been modeling SEE, developing fault tolerance methods for spaceborne 
reprogrammable computers, and testing them using the FPGA-based Configurable Fault 
Tolerant Processor experiment. In 2006, a new method of fault tolerance called 
Reduced Precision Redundancy (RPR) was introduced that offers a trade between precision in 
calculation and power savings on an FPGA. RPR can potentially reduce power 
consumption in an FPGA by up to 70% relative to traditional redundant fault tolerance techniques 
[3]. However, the concept of RPR is in its infancy - the method still needs to be 
evaluated as applied to operations commonly found in computers on space vehicles. 
A. OBJECTIVE

The objective of this thesis is twofold: to define further the implementation of 
fault tolerance using RPR, and to investigate the effect of using RPR on some major 
space vehicle processing problems. Algorithms for many common tasks on spacecraft 
can be broken down into a set of elementary operations; this thesis provides rules and 
considerations for applying RPR to several elementary operations as well as a model for 
assessing the impact of error propagation through a system to which RPR has been 
applied. 

B. BACKGROUND

1. Flexible Computing in Space Applications 

The recent advent of the Operationally Responsive Space (ORS) concept and 
Department of Defense (DOD) office of the same name have brought great publicity to 
the concept of using reprogrammable and otherwise flexible computers and processors in 
satellite systems. The benefits of having a reprogrammable asset in space are many: 
transceivers may be updated with new communications or compression algorithms; 
satellite control may be saved in the event of a primary computer failure (or partial 
failure); entire new mission applications may be uploaded for experimental purposes. 

Reprogrammable hardware options for digital computers include microprocessors, 
field-programmable gate arrays (FPGA), and digital signal processor (DSP) chips. Each 
type of hardware is optimally suited to provide a different combination of function, 
flexibility and processing speed. FPGAs are different from microprocessors and DSPs 
because the device configuration is actually programmable. This gives the FPGA greater 
operating speed for most tasks than a microprocessor, which must load applications only 
in software, and far greater flexibility than most DSPs, which have basic functions 
permanently implemented in hardware for speed. 

In general an FPGA will operate more slowly than a custom-designed application- 
specific integrated circuit (ASIC) that performs the same function, but the benefits of the 
after-market programmability of an FPGA often make it a more attractive choice. 
Specifically, an FPGA requires much less development time than an ASIC, enables 
post-production re-programming to fix bugs or problems discovered after implementation, and 
incurs lower design and development (non-recurring) costs. 

The ability to change or reprogram a system remotely after it has been activated is 
of particular interest to the space community, where the only unmanned satellite ever 
visited for repeated post-launch repairs has been the Hubble Space Telescope. The 
potential insertion points for new or updated algorithms, missions, and applications are 
numerous - but along with the benefits of reprogrammable computers in space come the 
limitations and special considerations associated with their unique operating 
environment. 

2. Reprogrammable Computers in the Space Environment 

It is a well-known fact in the satellite development and operation community that 
the space environment is unforgiving of its man-made trespassers [4]. Space vehicles 
experience erosion, thermal imbalance and other surface damage due to impact from 
micrometeorites traveling at relative velocities up to 8 km/s (in low-earth orbit). As orbit 
altitude increases, the percentage of ionized molecules in the particles surrounding the 
spacecraft also increases. This “plasma environment” causes charge to build up both on 
the surface of and deep inside spacecraft; the charge accumulation can overcome electric 
fields or trigger severely damaging arcing across the vehicle. In addition to general 
charging from the plasma environment, individual heavy ions from the Van Allen belts 
of the earth, solar events or cosmic radiation cause Single Event Effects (SEE) when they 
interact with a space vehicle. SEE include permanent hardware failures such as circuit 
burnout or gate latch-up as well as temporary failures due to single-event upsets (SEU). 
SEU occur when a highly energetic (MeV or higher) charged particle strikes one or more 
memory bits in a spacecraft computer, changing the bit’s energy level and thus also 
changing the value stored in that memory location. This is called a fault. Depending on 
the location and importance of the affected bit, a fault can cause an entire processor to 
fail temporarily until it can be stopped, its memory cleared, and restarted. 
The special quality of an FPGA relative to other electronic devices or processors 
is that an FPGA can be reconfigured, even after being launched in a spacecraft. RAM- 
based FPGAs are particularly flexible, since their entire configuration is set in memory. 
However, this creates a unique vulnerability in RAM-based FPGAs: both the application 
data and the configuration of the device - how it performs its intended mission - are 
susceptible to SEUs. When a fault occurs, it may not always cause a complete failure of 
the affected FPGA; it may merely change part of the FPGA function such that it 
continually generates errors in the output data. In this case, the fault will go undetected 
unless the configuration of the FPGA is checked periodically for errors. 

There are currently several versions of radiation-hardened processors and other 
components available from commercial electronics developers [5]. All are built to 
withstand the total radiation dose levels and accumulation of charge experienced over the 
lifetime of the satellite. However, even the radiation-hardened versions of FPGAs currently 
available do not inherently guard against SEU. Instead, the algorithms implemented on 
the FPGAs must have fault tolerance designed into them, using one of several possible 
approaches. 


3. Fault Tolerance Methods

The two most common approaches taken by designers to minimize the impact of 
single-event radiation effects on FPGAs are error correction coding (ECC) and 
redundancy - specifically Triple Modular Redundancy (TMR). A new variant on TMR, 
Reduced Precision Redundancy (RPR), is explored in this research as another viable fault 
tolerance approach. 


a. Error Correction Coding 

Error correction coding is applied to binary data in a communication or 
computer system to protect the integrity of the bitstream as it is moved over some spatial 
distance or stored for some length of time. Depending on the requirements of a system, 
error correction codes may correct errors automatically or merely detect them in order to 
alert an operator that errors have occurred. Practical error correction codes, such as the 
Hamming code, divide data into sections and append some number of check or parity bits 
to each section. The value of the check bits is determined by the pattern of the data bits 
[6]. The patterns of data and check bits are then decoded at the communications receiver 
or before the next operation in the computer. The reliability of data transmitted or stored 
using ECC generally increases greatly, quantified as a coding gain of up to 9 dB [7]. For 
a comprehensive treatment of several common ECC code families including Golay, 
Reed-Muller, Reed-Solomon, Viterbi, and trellis-coded modulation, see [7]. 
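As an illustration of how check bits are derived from data bits and later used to locate an error, the following is a minimal sketch of the standard Hamming(7,4) code in Python. It is not taken from this thesis or from [6]; the function names and the bit ordering (parity bits at positions 1, 2 and 4) are assumptions chosen for the example.

```python
def hamming74_encode(d):
    """Encode data bits [d1, d2, d3, d4] as [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(c):
    """Correct at most one flipped bit; return (data_bits, error_position)."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome read as a binary number is the 1-based position of the
    # flipped bit, or 0 when all parity checks pass.
    pos = s1 + 2 * s2 + 4 * s3
    if pos:
        c[pos - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], pos
```

For example, flipping any one of the seven bits returned by `hamming74_encode([1, 0, 1, 1])` and passing the result to `hamming74_decode` recovers `[1, 0, 1, 1]` together with the position of the flipped bit.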

b. Triple Modular Redundancy 

Triple modular redundancy (TMR) is a fault-tolerance approach that uses 
parallel computation and voting to detect and correct errors in a circuit. The basic 
structure of TMR is made up of three identical copies of an operation and the 
two-of-three majority voter construct seen in Figure 1. The voter is applied to each bit of the 
output of the three circuits. The bitwise majority voter will mask (correct) any single 
error in the three operation circuits, and it will flag (detect) any error in the voter itself. 



Figure 1. Bitwise Majority Voter With Single Error Correction and Voter Error Detection (after [8]).
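The two-of-three bitwise vote can be sketched in a few lines of Python. This is a software illustration only, not the gate-level voter of Figure 1: the function name `tmr_vote` and the `width` parameter are assumptions, and the returned disagreement mask is an illustrative addition that marks which output bits differed among the three copies. It does not model the voter self-check described above.

```python
def tmr_vote(a, b, c, width=8):
    """Bitwise two-of-three majority vote over three redundant outputs.

    Returns (voted, disagree): `voted` is the majority value, and `disagree`
    has a 1 in every bit position where at least one copy differs from it.
    """
    mask = (1 << width) - 1
    # A bit of the result is 1 when at least two of the three copies have a 1.
    voted = ((a & b) | (a & c) | (b & c)) & mask
    disagree = ((a ^ voted) | (b ^ voted) | (c ^ voted)) & mask
    return voted, disagree
```

With copies `0b1010`, `0b1010` and `0b1110`, the vote returns `0b1010` and a disagreement mask of `0b0100`, so the single upset bit in the third copy is masked.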
TMR is generally applied at the lowest level in a circuit where it can still 
effectively correct errors [8]. It provides highly reliable error correction for single errors, 
but the cost of that reliability is a circuit that requires more than three times the programmable 
logic space on an FPGA that the unprotected circuit requires. This thesis addresses 
the theory that Reduced Precision Redundancy may alleviate the limitations imposed by 
the size of a TMR circuit on an FPGA. 

c. Reduced Precision Redundancy 

The concept of Reduced Precision Redundancy (RPR) allows the sacrifice 
of some level of precision in calculation, in the event that an error occurs, in return for 
space and power savings on an FPGA. The theory as developed by Snodgrass [3] at the 
Naval Postgraduate School (NPS) in 2006 suggests that instead of generating three 
identical copies of a circuit and voting on the outcome as with TMR, some functions lend 
themselves to operating in a single thread at full precision, and in two additional threads 
at reduced precision. The two reduced-precision operations generate an upper and lower 
bound on the correct function output. The precise calculation result is then compared to 
the upper and lower bounds, and voting logic determines whether the precise result may 
be used, or if an error has occurred in the precise solution and the average of the bounds 
must be used as a less-precise result. 
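The bound-and-vote scheme described above can be sketched for unsigned fixed-point addition as follows. This is a minimal illustration under stated assumptions, not the design developed in this thesis or in [3]: the bounds are formed by truncating each operand to its most significant bits (lower bound) and by rounding each truncated operand up by one unit (upper bound), and the names `rpr_add`, `dropped` and `fault_mask` are hypothetical. In hardware the three threads would be separate modules; here they are computed in one function for brevity.

```python
def rpr_add(a, b, width=16, dropped=8, fault_mask=0):
    """Add two unsigned `width`-bit integers with an RPR-style check.

    `dropped` is the number of low-order bits ignored by the two
    reduced-precision threads. `fault_mask` is XORed into the
    full-precision result to emulate an upset (an assumption of this sketch).
    Returns (result, used_fallback).
    """
    full_mask = (1 << (width + 1)) - 1
    precise = ((a + b) ^ fault_mask) & full_mask  # full-precision thread

    a_hi, b_hi = a >> dropped, b >> dropped
    lower = (a_hi + b_hi) << dropped              # truncated operands
    upper = ((a_hi + 1) + (b_hi + 1)) << dropped  # operands rounded up

    if lower <= precise <= upper:
        return precise, False
    # The precise thread is out of bounds: fall back to the bound average.
    return (lower + upper) // 2, True
```

For `a = 0x1234` and `b = 0x0FF0` the fault-free call returns `(0x2224, False)`. With `fault_mask = 1 << 15` the corrupted sum falls outside the bounds and the call returns `(0x2200, True)`, which is within `2**dropped` of the true sum. A fault that leaves the precise result inside the bounds (for example `fault_mask = 1`) is not detected in this sketch, but the resulting error is limited by the distance between the bounds.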

Snodgrass demonstrated the concept of RPR in a Coordinate Rotation 
Digital Computer (CORDIC) algorithm built for the Xilinx Virtex™ XQVR600 devices 
on the Configurable Fault Tolerant Processor experiment platform at NPS. 

4. The Configurable Fault Tolerant Processor

The Configurable Fault Tolerant Processor (CFTP) is a research experiment in the 
NPS Space Systems Academic Group. It is a Xilinx Virtex™ FPGA-based platform for 
developing and testing new fault tolerance methods on a spacecraft computer system in 
both the laboratory and the space environment. On the ground, CFTP has been subjected 
to controlled radiation experiments using the cyclotron at the Crocker Nuclear 
Laboratory, University of California-Davis, CA [9]. Detailed descriptions of the CFTP 
architecture are available in [10] and [11]; instructions and examples of CFTP 
experiments can be found in [12]. 

The fault tolerance method currently used in most CFTP architecture and 
experiments is TMR [11], [12]. However, future generations of CFTP will incorporate 
more complicated experiments that, when implemented using TMR, will be limited by 
the capacity of the FPGA. Implementing new experiments using RPR is one method of 
alleviating the space limitation of the FPGA - but in order to do this, RPR must be 
defined and its effects understood completely for a wide range of possible applications. 
That definition and understanding is the intended goal of this thesis. 

C. ORGANIZATION OF THIS THESIS 

Chapter II describes the conditions necessary to apply RPR successfully to a 
problem, and presents two examples of systems that contain processes meeting the 
necessary criteria. The first example system type, spacecraft attitude determination and 
control, operates on a matrix of states in a feedback loop. The second example system, 
software-defined radio, processes data in a flow-through manner. Chapter II concludes 
with a list of elementary operations common to many processes that have been collected 
from the example systems and also meet the criteria for good RPR candidates. 

Chapter III discusses the common elementary operations in detail and describes 
how to apply RPR to each one. Included in each discussion are rules governing upper 
and lower bound selection for the operation and an example implementation in FPGA 
schematic design. The RPR implementations are compared to corresponding TMR 
implementations to determine the space and power savings of RPR. 

Chapter IV addresses the impact of errors due to single-event effects on the 
performance of a system using RPR. Performance is evaluated using a set of metrics 
based on modeling RPR contingency operations as spikes or increases in the system noise 
level. Both control and soft radio systems are examined via modeling in Simulink®. 

Chapter V explores a more complicated practical scenario in spacecraft attitude 
determination and control: the NPS Bifocal Relay Mirror Satellite (BRMS) Simulator 
(BRMSS). The BRMSS dynamics model is improved by adding the effects due to 
flexible appendages on the BRMSS structure. A more accurate linear-quadratic-Gaussian 
(LQG) controller, incorporating uncertainty in modeling and measurement due to 
unobservable states and the effects of additive white Gaussian noise, is developed as an 
alternative to the current proportional-derivative (PD) controller. Chapter V concludes 
by demonstrating connections between the more sophisticated BRMSS control system 
and the elementary operations described in Chapter III, as well as the performance 
evaluation methods proposed in Chapter IV. 

Chapter VI contains a summary, conclusion, and recommendations for future work. 
II. PROBLEM DISCUSSION


A. RECOGNIZING A SUITABLE OPERATION

In developing RPR, Snodgrass divides problems into two types: Class A, suitable 
for RPR implementation; and Class B, not suitable for RPR implementation. The most 
significant distinction between Class A and Class B problems is in the organization and 
representation of data [3]. 

Class A problems generally manipulate numbers or other data represented by 
blocks of bits ordered in increasing or decreasing importance. In a Class A problem, it is 
possible to repeat or duplicate processes using only a subset of the data bits - the most 
important bits - in order to produce a reduced-precision redundant result. Fixed-point 
numerical problems are a good example of Class A problems, as they have clearly- 
defined most significant bits (MSB) and least significant bits (LSB) for every item of 
data. 

Class B problems may represent data using single- or multi-valued (Boolean) 
logic functions where the importance of each data bit is the same. The output of such 
functions cannot be generated using fewer bits without greatly changing the meaning of 
the result. Problems also may be categorized as Class B when any less precise 
representation of the problem is significantly more complicated than the full-precision 
operation, as this would eliminate the benefits of RPR (less space and power required for 
reduced-precision calculations). Furthermore, some problems have no solution algorithm 
that can be executed in multiple ways; the full-precision method may be the only possible 
method. 
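The distinction between the two classes can be made concrete with a small Python sketch. The helper names `keep_msbs` and `parity` are hypothetical and the parity function is only one possible example of a logic function in which every bit carries equal importance; neither is drawn from [3].

```python
def keep_msbs(x, width, kept):
    """Zero all but the `kept` most significant bits of a `width`-bit value."""
    dropped = width - kept
    return (x >> dropped) << dropped


def parity(x):
    """Return 1 if x has an odd number of 1 bits, else 0."""
    return bin(x).count("1") % 2


x = 0b10110111
x_reduced = keep_msbs(x, width=8, kept=4)

# Class A behavior: as a fixed-point number, the reduced value stays close
# to the original; the error is smaller than 2**4.
numeric_error = x - x_reduced

# Class B behavior: a parity check weights every bit equally, so dropping
# the low bits can change the answer entirely.
parity_changed = parity(x) != parity(x_reduced)
```

Here the reduced-precision value `0b10110000` differs from the original by 7, which is below the truncation bound of 16, while the parity of the two words is different. The first case yields a usable approximation; the second does not.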

This study examines two “real world” applications of reprogrammable computers 
in space vehicles from the perspective of implementing RPR for each application. The 
two applications are attitude determination and control, an essential subsystem for most 
spacecraft; and signal processing, part of the spacecraft’s communications subsystem or 
payload. 
B. SPACECRAFT ATTITUDE DETERMINATION AND CONTROL


1. Purpose and Requirements of a Spacecraft Attitude Determination 
and Control System (ADCS) 

Almost any modern space-based mission requires knowledge and control of the 
spacecraft’s orientation - usually to point an antenna or telescope for transmitting or 
receiving information [13]. Even if the primary mission of a satellite does not require 
precise pointing, communication between the space vehicle and its ground control 
segment often requires some basic attitude control of the vehicle in order to compensate 
for various disturbance torques that affect spacecraft in orbit [14]. Typical requirements 
levied on the ADCS of a space vehicle include accuracy and range for both attitude 
knowledge and attitude control, as well as jitter, drift and settling-time constraints on the 
control system [14]. A diagram of a notional rigid body with torque components and 
attitude angles is shown in Figure 2. 



Figure 2. Definition of Attitude Angles and Torque Components in Spacecraft Reference Frame.

During its operational life, a space vehicle may undergo many changes - either in 
operational mission or in physical form or function. Although a space vehicle is designed 
with parts made to last beyond its mission lifetime, on some occasions events during 
launch or on orbit may cause certain parts to break partially or completely. In many 
cases, satellites are launched to fulfill a certain mission need, but after some period of 
operation they are re-tasked to accomplish a different mission instead of or in addition to 
their original purpose. In cases like these, it is imperative that the space vehicle 
subsystems be able to adapt to the physical or operational changes to the vehicle. 
Flexible computers for subsystems such as the ADCS are advantageous in these 
situations, because they allow changes in not only commanded trajectory (a standard 
input), but also stored mass and momentum properties of the vehicle, the importance of 
each sensor, or the capability of each actuator. Furthermore, if the ADCS of a satellite 
has been implemented on an FPGA, it can be reprogrammed to take advantage of new 
and more efficient control techniques developed after it has been launched. In addition, 
if the disturbance environment for a given satellite is more accurately modeled over time, 
the control algorithm or entire ADCS can be reprogrammed to utilize the knowledge 
gained from the new models. 

A space vehicle is generally subject to two types of external disturbance torques 
in its orbital environment: cyclic disturbances, which vary periodically as the spacecraft 
travels around its orbit, and secular disturbances, which are continuously additive and do 
not cancel themselves out over the course of an orbit [14]. One example of cyclic torque 
on an earth-oriented vehicle is solar radiation pressure, whose direction is always radially 
out from the sun (and therefore constant within any given revolution of an earth-orbiting 
spacecraft). An example of a constant (secular) torque on an earth-oriented vehicle is 
aerodynamic drag due to the earth’s upper atmosphere. This is particularly notable in 
LEO satellites. In addition to cyclic or secular external torques, some disturbance torques 
on a spacecraft are internal, e.g., liquid sloshing in fuel tanks [15]. 

It is important to distinguish between a torque, which acts only on 
the attitude and orientation of a vehicle, and a force, which controls the position 
of the center of mass of a vehicle with respect to an external reference frame. This 
research examines only the part of a spacecraft ADCS that controls the torques on the 
orientation of a spacecraft through the storage and use of angular momentum, and does 
not look at the kinematics of the entire vehicle’s trajectory through space. 
The effects of any disturbance torques on the dynamics of a space vehicle can be 
modeled mathematically, as in [15], [16], [17]. These dynamic models are in turn used to 
build attitude determination and feedback control system models for space vehicles, 
which often run autonomously on either a dedicated ADCS processor or the main space 
vehicle computer. Each operation in the ADCS processor must be executed at some 
minimum level of precision in order to maintain the tolerances required by a given 
mission for pointing accuracy, jitter control, drift, and settling time. 

2. General ADCS Overview 

A spacecraft ADCS is typically a feedback control system with two basic 
components: attitude knowledge or determination, and control. Attitude determination is 
achieved by processing data from sensors (e.g. earth or sun sensors, magnetometers) and 
control is executed using actuators. Actuators can be either passive (e.g., gravity booms, 
magnetic systems) or active (e.g., reaction wheels, momentum wheels, control moment 
gyroscopes (CMG)). 

Regardless of the type of sensors or actuators used, the ADCS processor must 
manipulate the measurements taken by the sensors, update the state matrix describing the 
orientation and rate of the vehicle, compute the necessary control to maintain or change 
the state of the vehicle, and allocate the prescribed control among the actuators. In most 
space vehicles built today, the ADCS processor is a digital computer that samples sensor 
values at discrete intervals determined by a system clock. Along with the external and 
internal disturbance torques on the satellite, additional error and uncertainty is introduced 
into the control system as noise from sampling and quantizing the attitude determination 
sensor measurements. A high-level block diagram of a feedback control system is shown 
in Figure 3, where the Vehicle Dynamics block represents the system being controlled 
(the plant) and the sensor measurements, control algorithm, and torque generation are all 
parts of the controller. 
Figure 3. A Typical Attitude Control System (From [14]).


Within the Control Computer block in Figure 3, there are separate processes for 
estimating the vehicle state based on sensor measurements, running the control algorithm 
(which operates on the difference between the estimated states and a commanded state), 
and allocating the resulting control torque appropriately among actuators [18]. This is 
depicted in Figure 4, with the functions of the control computer enclosed within the 
dotted line. 
Figure 4. Schematic of Fundamental Control Problems (From [18]).


This research examines only the control algorithm portion of the ADCS as a 
candidate for RPR. The estimation problem associated with sensing attitude and the 
allocation problem associated with operating actuators are beyond the scope of the RPR 
problem at this time. 

3. Example Control Algorithm: Proportional-Derivative Control

A basic control algorithm consists of comparing the current state of a system 
(position and rate) to its desired state, and calculating the correctional force or torque 
necessary to reduce or eliminate the difference between the actual and desired states. A 
simple mathematical representation of spacecraft dynamics and control uses linear 
ordinary differential equations (ODE) to describe the relationships among angular 
position, rate, and acceleration, and applied torque (from disturbances or the controller) 
in the roll, pitch, and yaw directions (refer to Figure 2). It is most straightforward to 
consider the case of pitch-only dynamics and control, as the pitch motion of an 
earth-orbiting spacecraft can often be decoupled from the roll and yaw motion for 
small-disturbance cases. Simple proportional-derivative (PD) control of the pitch motion for a 
three-axis-stabilized spacecraft can be modeled using a linear ordinary differential 
equation of motion (EOM): 

$$T_d = J_y \ddot{\theta} + K_d \dot{\theta} + K_p \theta \qquad (1)$$

where $T_d$ is the disturbance torque, $J_y$ is the principal moment of inertia of the 
spacecraft about the pitch axis, $\theta$ is the pitch angle, and $K_p$ and $K_d$ are the gains 
associated with the PD controller. In the simplest case, the $K_p$ and $K_d$ gains are constants 
that are initially chosen using a control design approach (e.g., the root locus method) based on the 
characteristics of the system. The goals of the control system are to achieve stability, and 
to minimize steady-state error and system settling time. 
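The closed-loop behavior implied by Equation (1) can be explored with a short time-domain simulation. The sketch below assumes the equation has the form J*theta_ddot + Kd*theta_dot + Kp*theta = Td, with a constant disturbance torque, and integrates it with a fixed-step semi-implicit Euler method. The function name, the step size and all numerical values are hypothetical and are not the parameters used in this thesis.

```python
def simulate_pd_pitch(J, Kp, Kd, Td, dt=0.01, t_end=60.0):
    """Integrate J*theta_ddot + Kd*theta_dot + Kp*theta = Td from rest.

    J  : pitch-axis moment of inertia (kg m^2)
    Kp : proportional gain (N m / rad)
    Kd : derivative gain (N m s / rad)
    Td : constant disturbance torque (N m)
    Returns the list of pitch angles (rad), one per time step.
    """
    theta, omega = 0.0, 0.0
    history = []
    for _ in range(int(round(t_end / dt))):
        # Angular acceleration from the disturbance and the PD control torque.
        alpha = (Td - Kd * omega - Kp * theta) / J
        omega += alpha * dt   # update rate first (semi-implicit Euler)
        theta += omega * dt   # then update angle with the new rate
        history.append(theta)
    return history


# Hypothetical values: natural frequency 0.5 rad/s, damping ratio 0.8.
angles = simulate_pd_pitch(J=100.0, Kp=25.0, Kd=80.0, Td=0.01)
```

Under these assumed values the pitch angle settles to the steady-state offset Td/Kp = 4e-4 rad. This illustrates why a PD controller alone leaves a residual pointing error under a constant disturbance, and why the gains trade steady-state error against settling time.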

The pitch control function is nominally implemented in a model of an ideal three- 
axis-stabilized spacecraft. The model includes the effect of environmental disturbance 
torques, and takes into account the coordinate transformations necessary to calculate the 
dynamics of the system (Figure 5). The PD controller itself contains only the simple 
multiplication and addition operations on the constant gains and state variables in 
Equation (1). However, if the control is extended to address disturbances in all three 
directions (roll, pitch and yaw) as well as coupled relationships among the angles, the 
operations become two-dimensional matrix multiplications. Other operations in the 
control model include storing and repeatedly accessing several constants or sets of 
constants (vectors of fixed or floating-point numbers), reshaping and selecting or 
multiplexing matrix elements, multiplying and dividing direction cosines to extract 
position angles, and calculating trigonometric functions. 
Figure 5. PD controller in ideal three-axis-stabilized spacecraft ADCS (After [19]).


Although most of the blocks in Figure 5 model the dynamics and kinematics of a 
spacecraft and would not be present in the ADCS processor on the actual platform, the
Extract Position Angles function is one example of additional manipulations that need to 
be performed in the control computer. The detailed operation of this function is shown in 
Figure 6. Within this subsystem are both elementary and more complicated operations. 



Figure 6. Extract Position Angles function in ADCS (After [19]).

Elementary operations include multiplication, division, signal limiting (ceiling or
floor function based on specified constant bounds), and selecting and multiplexing 
signals. More complicated operations include the inverse trigonometric functions used to 
convert the direction cosine data into Euler angles of spacecraft position. 

Although the control algorithm in the ADCS processor is essential on-orbit 
processing for most space vehicles, it is not the only computing application that must 
function correctly and reliably on board a satellite. Another subsystem that is necessary 
for every spacecraft to carry out its mission is the communication system. 

C. SPACECRAFT SIGNAL PROCESSING: SOFTWARE-DEFINED RADIOS 

1. Purpose and Requirements of a Spacecraft Signal Processor

One of the most computationally intensive parts of a spacecraft communication 
subsystem is the signal processor. Almost all modern spacecraft use digital computers, 
which operate on data that have been extracted from samples of a signal received by the 
communications antenna. Between the baseband information processing in the main 
spacecraft computer and the RF signal transmitted from or received by the antenna is a
series of signal processing operations that can be implemented in hardware or in some
combination of hardware and software. Reed [20] defines a software radio as “a radio
that is substantially defined in software and whose physical layer behavior can be
significantly altered through changes to its software.” Many software radio designers use
SRAM-based FPGAs for digital data processing because they can handle more complex
circuits than programmable digital signal processing (DSP) chips, and can execute signal
processing operations faster than standard microprocessors. This is particularly evident
when designers take advantage of parallel processing architectures allowed by FPGA
logic block configurations. FPGAs also offer great flexibility because they can be
reprogrammed to handle different architectures or algorithms, even “on the fly” during 
system operation. 

A very important requirement for a software radio processor is computation 
speed; it must be able to handle data at the sample rate necessary to receive and/or 
transmit all required information. In addition to speed, a software radio must be able to
handle some level of complexity in its circuits, for algorithm implementation. The
greatest environmental requirements on a space vehicle for any part or subsystem (as
opposed to functional requirements) are limitations on mass, power consumption, and
heat dissipation, all due to on-board resource availability. All these must be taken into
account when deciding how to allocate functions to hardware and software in a
spaceborne software-defined radio.

2. General SDR Overview 

In general, a radio is meant to process streams of signals and data. Signals are 
received through an antenna, decoded, and presented to an observer or listener through an 
output. The data can also be internally generated or input through a device, coded, and 
transmitted as an RF signal through the antenna. In either direction, data is rarely saved 
or retransmitted, except in the case of error-checking protocols such as automatic repeat 
request (ARQ). Compared to the ADCS, which has a set of system states and state errors 
that are continually updated using a feedback loop, a radio can be considered a “flow- 
through” process. Data may be processed in a continuous stream or broken into batches, 
but either way data are brought in, manipulated, and put out in a minimal number of 
clock cycles of the system. Within the communications system family, the characteristics 
distinguishing a software radio from a traditional radio depend on how much of the 
radio’s functionality can be changed without replacing hardware, and how flexible the 
radio is overall. 

Any software radio is identified by the presence of flexibility throughout its 
architecture. A generic block diagram of a software radio is shown in Figure 7. The 
“smart antenna” and RF hardware are ideally able to cover multiple spectrum bands 
within a broad range. The analog-to-digital converter (ADC) and digital-to-analog 
converter (DAC) may have multiple settings for fidelity of information (e.g., 8-, 12-, or 
16-bit converters). The processor(s) may implement communications algorithms in some 
combination of software and hardware, depending on requirements for speed and 
reprogrammability. 
Figure 7. Model of a Software Radio (From [20]).


Although software radios can use any combination of application-specific 
integrated circuits (ASICs), DSPs, or FPGAs as processing hardware, this research 
focuses on implementing software radio processes on RAM-based FPGAs (such as those 
on CFTP) and how to make those processes fault-tolerant using RPR. Within the 
“processing” block, there are functions that break down or divide the signal into batches, 
decode the signal, reconstruct information from decoded data, communicate with other 
processes, or report to a higher level processor or to an overall operating system. This 
exploration focuses on one of the most common operations found in the signal processing 
block of any radio: the Discrete Fourier Transform (DFT), and as implemented in digital 
computers, the Fast Fourier Transform (FFT). 

3. Example SDR Function: Fast Fourier Transform (FFT) 

The operation used to translate a sampled signal represented by a set of N data 
points in the time domain to a second set of N points in the frequency domain is the 
Discrete Fourier Transform (DFT), given by 

$$X[k] = \mathrm{DFT}\{x[n]\} = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}, \quad \text{for } k = 0, \ldots, N-1 \qquad (2)$$

where $x[n]$ are the time-domain input samples, $X[k]$ are the frequency-domain output
samples, and $j = \sqrt{-1}$. The complex exponential can also be represented as the phase
factor $W_N^{kn}$, as in

$$X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{kn} \qquad (3)$$

where $W_N = e^{-j 2\pi / N}$ and the subscript N indicates that there is a unique phase factor for
any FFT of size N. When the DFT is implemented in digital computers to process sets of
data where N is a power of 2, the transform operation can be continually broken down
into the sum of two half-size transforms, as in



$$X[k] = \sum_{n=0}^{N/2-1} x[2n]\, W_{N/2}^{kn} + W_N^{k} \sum_{n=0}^{N/2-1} x[2n+1]\, W_{N/2}^{kn} \qquad (4)$$


until the lowest level is reached, where Equation (4) simplifies to the addition of two
input elements with a coefficient $W_N^k$. When N = 2, the coefficient $W_2^0 = 1$,
so the fundamental operation for each pair of output points becomes the addition and
subtraction of two corresponding input points - known as a butterfly operation or 
butterfly machine (BFM). A graphical representation of this operation, where the 
crisscross “winged” shape is evident, is shown in Figure 8. 


[Figure 8: butterfly flow graph with inputs $a$ and $b$ and outputs $a + W_N^k b$ and $a - W_N^k b$]

Figure 8. Graphical representation of FFT BFM.
Computing a DFT as a series of butterfly operations is known as the Radix-2 Fast
Fourier Transform (FFT). The complexity of the FFT increases as $N \log_2(N)$, while the
complexity of the DFT operation grows as $N^2$. Additional details on the FFT, its
complexity relative to that of the DFT, and this derivation are available in [21].
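
The relationship between Equations (2) and (4) can be sketched in a few lines of Python. This is a minimal illustration only, assuming N is a power of 2; the function names `dft` and `fft_radix2` are hypothetical and do not correspond to any implementation in the thesis.

```python
import cmath


def dft(x):
    """Direct evaluation of Equation (2); cost grows as N**2."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]


def fft_radix2(x):
    """Recursive split into two half-size transforms, as in Equation (4)."""
    N = len(x)
    if N == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])   # transform of x[2n]
    odd = fft_radix2(x[1::2])    # transform of x[2n + 1]
    out = [0j] * N
    for k in range(N // 2):
        w = cmath.exp(-2j * cmath.pi * k / N)   # phase factor W_N^k
        # One butterfly: sum and difference of the paired half-size outputs.
        out[k] = even[k] + w * odd[k]
        out[k + N // 2] = even[k] - w * odd[k]
    return out


samples = [1, 2, 3, 4, 0, -1, -2, -3]
difference = max(abs(p - q) for p, q in zip(dft(samples), fft_radix2(samples)))
print(difference)   # tiny floating-point difference between the two methods
```

Each level of recursion performs N/2 butterflies, and there are $\log_2 N$ levels, which is where the $N \log_2 N$ growth comes from.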

One BFM is used to compute a two-point FFT (N = 2). In a two-point FFT, the
phase factor or twiddle factor $W_N^k$ used to multiply the second operand (b in Figure 8)
prior to the addition and subtraction is equal to 1. To compute FFTs of practical lengths
(e.g., N = 8 or higher), multiple BFMs are implemented in levels and pipelined. The
number of levels of BFMs in an FFT is $l = \log_2 N$, with each stage containing N/2
radix-2 butterfly operators. The only differences from one butterfly operation to the next,
or among levels of butterflies, are the memory locations of the input, the memory locations
of the output, and the factors $W_N^k$. The nets (connections) between levels or individual
operations are mapped differently depending on the algorithm of the FFT; the algorithm
is chosen based on design implementation (e.g., pipelined or iterative) and also affects the
order of the input and output points. A diagram of an eight-point DFT calculation using
an in-place algorithm FFT with three levels of butterfly operations is shown in Figure 9.
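
The level structure described above can be sketched as an iterative, in-place transform. The version below is one common arrangement (decimation in time with bit-reversed input and naturally ordered output); it is an assumption made for illustration and is not necessarily the ordering drawn in Figure 9. The names `bit_reverse` and `fft_in_place` are hypothetical.

```python
import cmath


def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index (input reordering)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def fft_in_place(x):
    """Radix-2 FFT computed level by level; len(x) must be a power of 2.

    Returns the transform and the number of butterflies done at each level.
    """
    N = len(x)
    levels = N.bit_length() - 1
    if 1 << levels != N:
        raise ValueError("length must be a power of 2")
    data = [complex(x[bit_reverse(i, levels)]) for i in range(N)]
    butterflies_per_level = []
    for level in range(1, levels + 1):
        span = 1 << level          # size of each sub-transform at this level
        half = span // 2
        count = 0
        for start in range(0, N, span):
            for k in range(half):
                w = cmath.exp(-2j * cmath.pi * k / span)   # twiddle factor
                a = data[start + k]
                b = data[start + k + half]
                data[start + k] = a + w * b        # upper butterfly output
                data[start + k + half] = a - w * b  # lower butterfly output
                count += 1
        butterflies_per_level.append(count)
    return data, butterflies_per_level


spectrum, counts = fft_in_place([1, 2, 3, 4, 0, -1, -2, -3])
print(counts)   # three levels of N/2 = 4 butterflies for N = 8
```

Only the operand addresses and the twiddle factor change from one butterfly to the next, which is the property that makes the butterfly a natural unit for replication in hardware.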
Figure 9. Flow graph of an Eight-Point DFT using three levels of butterfly operations (From [22]).


The FFT is among the most commonly found functions implemented in pipelined
form in SDRs [20]. The pipelining and parallel data processing possible in an FPGA
enable a computation speedup proportional to the number of pipeline stages used in the
implementation.

The prefabricated FFT used in a software radio design developed at the Naval
Postgraduate School in 2008 [23] is the Xilinx Fast Fourier Transform v4.1, which
computes the FFT with the Cooley-Tukey algorithm [24]. Wright [23] uses the FFT v4.1
in pipelined, streaming input/output (I/O) mode, depicted in Figure 10. In this mode,
which is computed using radix-2 BFMs with either bit-reversed or naturally-ordered
addressing, each stage of the FFT has its own memory for storing input and
(intermediate) output. Both the input data and the phase factors may be expressed as
fixed-point numbers with width between 8 and 24 bits, inclusive.
Figure 10. Xilinx FFT v4.1 Pipelined, Streaming I/O Architecture (From [24]).


In the Xilinx v4.1 pipelined streaming I/O configuration, each stage may begin
computation when the first pair of results is available from the previous stage. Output is
available continuously after the latency period of the pipeline. Detailed timing
information on the streaming I/O mode of the Xilinx FFT v4.1 is available in [24].

The fundamental processes that make up a butterfly operation are addition,
subtraction, and multiplication. These are somewhat complicated by the fact that in any
FFT with N > 2, the phase factor is complex; therefore most intermediate and final
results are also complex. However, in most FFT implementations (including the Xilinx
FFT v4.1), the real and imaginary parts of complex values are handled separately;
supplementary multiplication, addition and subtraction logic is included to handle the
recombination of real, imaginary and complex values as needed. The complex phase
factors are calculated and stored in advance. An additional task for running an FFT made
up of layers of BFMs is the memory indexing required for input and output at each stage.
It is also worth considering the butterfly as a single operation, and investigating whether
it, as a collection of more elementary operations, is a suitable candidate for RPR.
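
To make the elementary operations inside a complex butterfly concrete, the sketch below expands one butterfly into real-valued multiplications, additions and subtractions. It uses the straightforward four-multiplier form of the complex product; this is an illustrative assumption and does not describe the internal logic of the Xilinx core. The function name is hypothetical.

```python
def butterfly_real_imag(a_re, a_im, b_re, b_im, w_re, w_im):
    """One radix-2 butterfly with real and imaginary parts handled separately.

    Returns ((top_re, top_im), (bottom_re, bottom_im)) for a + w*b and a - w*b.
    """
    # Complex product w*b: four real multiplications, one subtraction, one addition.
    p_re = w_re * b_re - w_im * b_im
    p_im = w_re * b_im + w_im * b_re
    # Butterfly sum and difference: two additions and two subtractions.
    top = (a_re + p_re, a_im + p_im)
    bottom = (a_re - p_re, a_im - p_im)
    return top, bottom


# a = 1 + 2j, b = 3 - 1j, w = -1j (the phase factor W_4^1)
print(butterfly_real_imag(1.0, 2.0, 3.0, -1.0, 0.0, -1.0))
```

Counted this way, one complex butterfly is a small collection of exactly the elementary operations (multiplication, addition, subtraction) that Chapter III examines individually.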
D. COMMON ELEMENTARY OPERATIONS


In order to apply RPR to a PD controller or a pipelined FFT, it is necessary to 
determine how to apply RPR to the elementary operations used in these systems. The 
elementary arithmetic operations used in these systems include addition, subtraction, 
multiplication, and division. In principle, each of these operations is a Class A problem; 
results may be calculated at full precision or with less precision. Chapter III investigates 
rules for applying RPR to each function individually, and whether any groups of these 
operations are similar enough for a designer to adhere to a common set of rules for RPR. 

In addition to the arithmetic operations, other fundamental processes are 
necessary to execute either the controller or the FFT. These include memory indexing 
(e.g., “output shuffling” at the end of the FFT), element selection or concatenation from a 
set of stored or transmitted data (e.g., values from a direction cosine), and maintaining 
constants in memory (e.g., FFT phase factors or controller gains). These processes and 
requirements are all more appropriately Class B problems. Justification for these
decisions is recorded in Table 1.


| Operation | Key Activity | Justification for Class B |
|---|---|---|
| FFT output shuffling | Reassigning memory index | Requires exact value for unique memory location address. |
| ADCS matrix element selection or vector concatenation | Forwarding a single variable or subset of variables from a stored or transmitted block of data | From memory: requires exact address. From data stream: may be modified to handle data represented by fewer bits, but will almost always be stored in intermediate RAM location regardless. |
| Accessing stored constants | Reading data from memory locations | Requires exact value for unique memory location address.* |

*Benefit of coding vs. copying stored values needs further study: storing copies, even with reduced precision, requires more memory than implementing parity checks. Both approaches require additional decision logic, for voting or for correction.

Table 1. Determining Operations Not Suitable for RPR (Justification from [3]).

Finally, there are two significant compound processes used in these systems:
matrix multiplication in the controller, and the butterfly operation in the FFT. If either or 
both of these operations can be made fault-tolerant by applying RPR at the compound- 
process level rather than at the elementary operation level, there is potential to save 
additional space and power on an FPGA. 

Whether in an ADCS or an SDR, there are basic arithmetic operations that form 
the heart of the computational processes required of a space vehicle computer. Chapter 
III explores the nature of each suitable operation and provides rules for applying RPR in
each case. 
III. REDUCED-PRECISION REDUNDANCY FOR COMMON ELEMENTS

A. ASSUMPTIONS, TERMINOLOGY AND GENERAL RULES

Any operation protected by RPR has two main processes: (1) executing the
operation (in both full and reduced precision), and (2) error detection, result selection
and reporting. In some cases the upper and lower bounds of one or more operands must
be computed prior to beginning the RPR operation (one might call this “process (0)”); in
other cases the operands are already available in full and reduced precision (e.g., if
generated by the last operation). In this investigation of RPR applications, all arithmetic
operation sections include the following subsections: an expression of error in the
computation of the fundamental operation, the approach to determining RPR bounds for
input (operands) and output (result), and a demonstration either in tabular form or as
programmed using Xilinx Integrated Software Environment (ISE) Release 6.3.03i. The
Xilinx ISE modules were generated for the Xilinx Virtex™ XQVR600
radiation-hardened device, which is the FPGA used on the NPS CFTP, using a combination of
VHDL and schematic entry. Module functionality was tested using the simulation
environment ModelSim SE version 6.3c. The RPR demonstrations implemented in ISE
contained both arithmetic operations and voters, and were compared to analogous TMR
demonstrations using FPGA area occupied as a metric for evaluation. The FPGA area
was reported during the mapping process in ISE as a slice count, where a slice consists of
two 4-input LUTs, two D flip-flops, and two carry and control units [25].

In order to discuss different methods and effects of applying RPR, it is necessary 
to define a metric that represents the “degree” of RPR - that is, the amount by which 
precision is reduced from the original high-precision calculation to the redundant lower- 
precision calculations. An easily-accessible quantity is the ratio of number of non-sign- 
bits r in the variables of the redundant calculations to the number of non-sign-bits n in the 
variables of the precise calculation. For example, if the precise operands and result are
represented in eight bits of precision and the upper and lower bounds have five bits of
precision, then the degree of RPR is r/n = 5/8. If the precise calculation is made using 32
bits of precision and the redundant calculations are made using 16 bits of precision, the
degree of RPR is r/n = 16/32. This is of course mathematically equivalent to 1/2 (or
8/16); however, the value of the denominator is important¹, so the fraction is not
reduced. When comparing different degrees of RPR for a single original calculation of a
given n, the ratios may be expressed as decimals to indicate the relative precision
retained. For example, RPR may be applied to a 64-bit number as 32/64, 16/64, 8/64, or
any other ratio r/n. To make the metric easily comparable when considering performance
of different RPR approaches for this fixed n, the degree of RPR may be expressed as
0.50, 0.25, 0.125, etc. instead of as fractions.

In each section of this chapter, an RPR protection approach is introduced. For the 
purposes of illustrating the approaches with experiments, the following conventions are 
used for number representation and digital computation. 

For addition and subtraction, numbers in digital computation are represented in 
fixed-point two’s complement (generally of 8-, 16-, 32-, or 64-bit width), with the radix 
point one place to the right of the MSB (i.e., after the sign bit). This implies 7, 15, 31 or 
63 bits of precision, respectively. In practical applications, inputs a and b would be 
scaled prior to the operation such that they fall within the range $-1 < (a, b) < 1$. An
example of this representation is seen in Figure 11, where the decimal number
$-0.785398_{10}$ ($-\pi/4$ to six decimal places) is represented in two’s complement format as a
fifteen-bit fraction with a leading sign bit. The value of the binary number in Figure 11,
when converted back to decimal notation, is actually $-0.78536\ldots_{10}$, because the smallest
representable value in fifteen places of fixed-width fractional precision is
$\pm 2^{-15} = (1/32768)$ or $0.00003\ldots_{10}$.


1 Consider the degree of RPR to be similar to the time signature in a musical piece: 4/4 is 
mathematically equivalent to 2/2, but in fact there is a different number of beats per measure in each 
scheme, as well as a different note representing the fundamental beat. The “feel” of the rhythm in 4/4 vs. 
2/2 is quite different. 
[Figure 11: sixteen-bit word with sign bit 1, followed by the radix point and the fifteen fractional bits 001101101111001 (MSB to LSB)]

Figure 11. Sixteen-Bit Fixed-Point Two’s Complement Representation of $(-\pi/4)$.


For multiplication and division, numbers are also represented as fractional fixed-point
values, but are in sign-magnitude form (as opposed to two’s complement). The
additional processing required to multiply two’s complement numbers instead of
sign-magnitude numbers makes two’s complement a poor choice of representation for
multiplication. This is explained further in Section C, on multiplication. Figure 12 shows
the sign-magnitude format for $-0.785398_{10}$ (compare to Figure 11). The absolute value
of the number is represented in the fifteen fractional bits, and the MSB is the sign bit (1
for negative numbers).


[Figure 12: sixteen-bit word with sign bit 1, followed by the radix point and the fifteen magnitude bits 110010010000111 (MSB to LSB)]

Figure 12. Sixteen-Bit Fixed-Point Sign-Magnitude Representation of $(-\pi/4)$.
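
The two number formats can be reproduced with a short sketch. It assumes a sixteen-bit word (one sign bit and fifteen fractional bits), inputs scaled so that the magnitude is below 1, and truncation of the magnitude toward zero, which is the choice that matches the bit patterns in Figures 11 and 12. The function names are hypothetical.

```python
import math

FRACTION_BITS = 15   # fifteen bits to the right of the radix point


def to_sign_magnitude(x, frac_bits=FRACTION_BITS):
    """Sign bit followed by the truncated magnitude of x (requires |x| < 1)."""
    magnitude = int(abs(x) * (1 << frac_bits))   # int() truncates toward zero
    sign = 1 if x < 0 else 0
    return f"{sign}{magnitude:0{frac_bits}b}"


def to_twos_complement(x, frac_bits=FRACTION_BITS):
    """Two's complement word for the same truncated magnitude."""
    magnitude = int(abs(x) * (1 << frac_bits))
    value = -magnitude if x < 0 else magnitude
    width = frac_bits + 1
    return f"{value & ((1 << width) - 1):0{width}b}"


def from_twos_complement(bits):
    """Decode a two's complement bit string with the radix point after the sign bit."""
    width = len(bits)
    raw = int(bits, 2)
    if raw >= 1 << (width - 1):
        raw -= 1 << width
    return raw / (1 << (width - 1))


print(to_twos_complement(-math.pi / 4))          # 1001101101111001
print(to_sign_magnitude(-math.pi / 4))           # 1110010010000111
print(from_twos_complement("1001101101111001"))  # -0.785369873046875
```

The decoded value differs from $-\pi/4$ by less than one LSB ($2^{-15}$), which is the representation error discussed in the following section.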

Additional characteristics, constraints and rules for implementing RPR in each 
operation are expounded in the operation sections. The final section of this chapter 
discusses the logic required to fabricate an RPR voter, because regardless of the method 
required to determine the upper and lower bounds on a result, much of the voting process 
is the same once the bounds are obtained. At the conclusion of the chapter, RPR is 
viewed from the perspective of error detection and correction in terms of the trades a
designer must make in order to use RPR effectively on the operations most suitable for its 
application. 

B. ADDITION AND SUBTRACTION


1. Computation Error in Addition and Subtraction 


Given two fixed-point binary numbers a and b of the same precision n with the radix
point in the same location, there is no significant round-off error due to a single execution
of the fundamental operation of addition or subtraction [26]. That is, the mathematical
operation $c = a \pm b$ is computed accurately to the precision of a and b. The accuracy of
each binary operand is limited by its precision; that is, if a is a fractional binary number
with n places after the radix point, it can only be accurate to within $2^{-(n+1)}$ of its actual
value. The error magnitude in any fixed-point fractional number of n places is therefore
$\varepsilon < 2^{-(n+1)}$. When two numbers are added, the error in each operand may be additive or
subtractive, i.e., in the same or different directions on a number line (Figure 13). The
maximum error in the sum occurs when the errors in the operands are additive, and turns
out to be

$$\varepsilon_{sum} = \varepsilon_{max} + \varepsilon_{max} = 2\varepsilon, \qquad \varepsilon_{sum} < 2^{-n} \qquad (5)$$

which is equal to the value represented by the LSB of the result. This is shown by the
potential range $c - 2\varepsilon$ to $c + 2\varepsilon$ for the result c in Figure 13. Although the largest
possible $\varepsilon_{sum}$ is still only the smallest number that may be represented using the chosen
precision, an error in that LSB can propagate to the next level of computation and
therefore may become significant in later operations.
Figure 13. Number line showing maximum error in one addition operation.


Since the extreme values $c - 2\varepsilon$ and $c + 2\varepsilon$ represent the maximum range of the
result c of an addition or subtraction operation, these extrema also define the bounds on
the error in the result c. This concept of bounding error in results can be extended to
problems of different levels of precision when applying RPR to addition and subtraction.

2. Upper and Lower Bound Determination for Addition and Subtraction

To apply RPR successfully to a given operation, it is necessary to determine the 
reduced-precision upper and lower limits, or bounds, of the result (the output) as 
functions of the operands (the input). Using two’s complement notation for addition and
subtraction enables identical treatment of the two operations, provided the transitions
between the negative number closest to zero ($1.111\ldots_2$) and zero ($0.000\ldots_2$) are carried
out successfully. In this scheme, the upper bound $x_U$ of any operand or result x must lie
to the right of x on a number line; the lower bound $x_L$ must lie to the left of x, as shown
in Figure 14. This means that if x is a negative number, the magnitude of $x_L$ is larger
than the magnitude of x, whose magnitude is in turn larger than the magnitude of $x_U$.
[Figure 14: number line of two’s complement values from 1.100 through 1.111 (negative numbers), 0.000, and 0.001 through 0.100 (positive numbers), with the upper bound $x_{1U}$ labeled]

Figure 14. Upper and Lower Bounds of Fixed-Point Two’s Complement Numbers.


When operands a and b are numbers of precision n and the degree of RPR desired
is r/n, the lower bounds $a_L$ and $b_L$ are defined as the highest (rightmost on number line)
numbers of precision r such that $a_L \le a$ and $b_L \le b$. The upper bounds $a_U$ and $b_U$ are
defined as the lowest (leftmost on number line) numbers of precision r such that $a < a_U$
and $b < b_U$. For example, if n = 5, r = 3 and $x_1 = 0.00101_2$ as in Figure 14, then

$$x_{1L} = 0.001_2 \le 0.00101_2 \quad \text{and} \quad 0.00101_2 < 0.010_2 = x_{1U}. \qquad (6)$$

Using operand upper and lower bounds as demonstrated in Equation (6), the sum
or difference of two fixed-point fractional two’s complement operands will always be
contained within a result range bounded as described in Equation (7).

For $c = a + b$ and $d = a - b$, given $a_L \le a < a_U$ and $b_L \le b < b_U$:

$$c_L \le c < c_U, \quad d_L < d < d_U, \quad \text{where} \quad c_L = a_L + b_L, \; c_U = a_U + b_U, \; d_L = a_L - b_U, \; d_U = a_U - b_L \qquad (7)$$

Subtraction can be accomplished in two ways. One way is to create a distinct
RPR subtraction module that executes only subtraction and calculates the bounds as
defined for differences d in Equation (7). The other way is to prepend a
two’s-complementer to an RPR addition module, which calculates bounds as defined for sums c
in Equation (7), and then to add the two’s complement of the subtrahend to the minuend.
The two methods generate the same result; the two’s complement/sign change method
will be used from this point forward because it generates the fewest additional special
cases.
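
The bound definitions of Equations (6) and (7) can be sketched with integer arithmetic. In this illustration a fixed-point value is held as an integer count of $2^{-n}$ units, the lower bound is obtained by clearing the (n - r) least significant bits, and the upper bound is the next representable r-place value. The result bounds are formed as lower plus lower, upper plus upper, lower minus upper, and upper minus lower; that pairing is the usual interval-arithmetic choice and is assumed here. Overflow is not modeled, and all function names are hypothetical.

```python
def operand_bounds(a, n, r):
    """Lower and upper r-place bounds of a, with a in units of 2**-n.

    Python's >> on negative integers rounds toward minus infinity, so the
    lower bound lies to the left of a for negative values as well.
    """
    shift = n - r
    lower = (a >> shift) << shift      # clear the (n - r) LSBs
    upper = lower + (1 << shift)       # next r-place value to the right
    return lower, upper


def sum_bounds(a, b, n, r):
    """Bounds on c = a + b built from the operand bounds."""
    a_lo, a_up = operand_bounds(a, n, r)
    b_lo, b_up = operand_bounds(b, n, r)
    return a_lo + b_lo, a_up + b_up


def difference_bounds(a, b, n, r):
    """Bounds on d = a - b built from the operand bounds."""
    a_lo, a_up = operand_bounds(a, n, r)
    b_lo, b_up = operand_bounds(b, n, r)
    return a_lo - b_up, a_up - b_lo


n, r = 5, 3
x1 = 0b00101                              # 0.00101 in binary
lo, up = operand_bounds(x1, n, r)
print(f"{lo:05b} <= {x1:05b} < {up:05b}")  # 00100 <= 00101 < 01000
```

The printed bounds correspond to $0.001_2$ and $0.010_2$, the values in Equation (6).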
Equation (7) is true regardless of sign and magnitude of a and b, as is shown in
the following exercise. In a two-input addition or subtraction operation, one may
characterize the problem using three Boolean variables that express properties of the
input set: the sign $S(a)$ of the first addend (or minuend) a, the sign $S(b)$ of the second
addend (or subtrahend) b, and the magnitude of a compared to the magnitude of b (or
$|a| > |b|$). In the simplest case, there are $2^3 = 8$ possible arrangements on a number line
for the operands and result of the problem $c = a \pm b$ as described by these properties.
Special cases, such as where $|a| = |b|$, will be addressed separately. Table 1 lists the eight
simple scenarios; Figure 15 and Figure 16 show all eight cases on number lines. In each
scenario, the sum c or difference d is wholly bounded by the sum or difference of some
combination of the upper and lower bounds of the operands, as dictated by Equation (7).


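| Case | $S(a)$ | $S(b)$ | $\lvert a \rvert > \lvert b \rvert$ | $S(a + b)$ | $S(a - b)$ |
|---|---|---|---|---|---|
| 1 | 0 | 0 | 1 | 0 | 0 |
| 2 | 0 | 0 | 0 | 0 | 1 |
| 3 | 0 | 1 | 1 | 0 | 0 |
| 4 | 0 | 1 | 0 | 1 | 0 |
| 5 | 1 | 0 | 1 | 1 | 1 |
| 6 | 1 | 0 | 0 | 0 | 1 |
| 7 | 1 | 1 | 1 | 1 | 1 |
| 8 | 1 | 1 | 0 | 1 | 0 |

Table 1. Eight Combinations of Input Properties for Addition and Subtraction.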
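
The claim that the bounds hold in all eight cases can be checked by brute force for a small word size. The sketch below is an illustration under stated assumptions: values are integers in units of $2^{-n}$, the sign bit is 1 for negative values, pairs of equal magnitude are skipped because they are treated as special cases, result overflow is ignored, and the result bounds are the interval-arithmetic combinations of the operand bounds. The helper names are hypothetical.

```python
def operand_bounds(a, n, r):
    """Lower and upper r-place bounds of a (a in units of 2**-n)."""
    shift = n - r
    lower = (a >> shift) << shift
    return lower, lower + (1 << shift)


def case_number(a, b):
    """Row of the eight-case table for operands of unequal magnitude."""
    s_a = 1 if a < 0 else 0
    s_b = 1 if b < 0 else 0
    a_larger = 1 if abs(a) > abs(b) else 0
    return 1 + 4 * s_a + 2 * s_b + (1 - a_larger)


def check_all_cases(n=5, r=3):
    """Return {case: True/False} telling whether the bounds held for every pair."""
    held = {}
    limit = 1 << n
    for a in range(-limit + 1, limit):          # all values in (-1, 1)
        for b in range(-limit + 1, limit):
            if abs(a) == abs(b):
                continue                        # equal-magnitude special cases
            a_lo, a_up = operand_bounds(a, n, r)
            b_lo, b_up = operand_bounds(b, n, r)
            sum_ok = a_lo + b_lo <= a + b < a_up + b_up
            diff_ok = a_lo - b_up < a - b < a_up - b_lo
            case = case_number(a, b)
            held[case] = held.get(case, True) and sum_ok and diff_ok
    return held


print(check_all_cases())   # every one of the eight cases maps to True
```

An exhaustive sweep of this kind is practical only for short words, but it gives a quick sanity check on bound logic before it is committed to hardware.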


Figure 16. Cases 5 through 8 for $c = a + b$ and $d = a - b$.
There are a few additional considerations and special cases worthy of discussion
when working with addition and subtraction. None invalidates the rules for determining 
bounds; however, each needs to be investigated to confirm its conformity. 

a. Special Cases Where $|a| = |b|$

There are four special cases where the operands are of equal magnitude:
(1) a = b and a,b > 0; (2) a = b and a,b < 0; (3) a = -b and a > 0; (4) a = -b and a < 0. In
cases (1) and (2) the operands subtract to give zero; in (3) and (4) they add to give zero.
Although they cross the boundary between positive and negative numbers, these cases
follow the same bound convention as those with unequal operands. These are special
cases for two reasons. First, they operate in the transition between $1.111\ldots_2$ and
$0.000\ldots_2$, and therefore at least one case should be included to test the carry-out/carry-in
operation to execute that transition. Second, it is notable that the upper and lower bounds of the
zero results are necessarily of opposite sign; in circuit design and operation this should
not be a problem, but the designer must be aware of the crossover.

Finally, it may seem that if an error were to occur in the precise 
calculation in these cases, some operations that should result in true zero would accrue 
some nonzero value at the output. However, this may be avoided with appropriate error 
correction techniques, which are discussed in the section on RPR voter logic. 

b. Special Cases Where Precise and Bound Values Are the Same 

When an operand a or b contains only r significant digits (i.e., all the
least significant (n - r) bits are zero), then the condition $a_L = a$ or $b_L = b$ applies. In this
case, the correct precise results c and d will always fall within a range that is only half the
maximum error allowed by Equation (7). It is possible to tighten the bounds on the result
in this case by adding logic to test operands for zeros in the (n - r) LSBs: if an operand a
with only r significant digits is discovered, then $a_U$ may be set equal to $a_L$ (which is
already equal to the precise value a). This will reduce the maximum error of the results
($c_U - c_L$ and $d_U - d_L$) to half their nominal range, restoring the effective precision of c
and d to r bits. This is equally applicable to the minuend a or subtrahend b. However,
the benefit gained from the relatively unlikely occurrence of this case may not be worth
the additional logic circuitry required to detect the LSB condition and reset the bounds.
Candidate systems would need to be examined on a case-by-case basis to determine the
merit of including this option.
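
The optional tightening described above amounts to one extra test on the discarded bits. The following sketch assumes values stored as integers in units of $2^{-n}$; the function name `tightened_bounds` is hypothetical, and the extra comparison stands in for the additional detection logic mentioned in the text.

```python
def tightened_bounds(a, n, r):
    """Operand bounds with the upper bound collapsed when a is exact in r places."""
    shift = n - r
    lower = (a >> shift) << shift
    # True when the (n - r) least significant bits of a are all zero.
    exact_in_r_places = (a & ((1 << shift) - 1)) == 0
    upper = lower if exact_in_r_places else lower + (1 << shift)
    return lower, upper


n, r = 5, 3
a = 0b00100          # only r significant places: bounds collapse onto a
b = 0b00101          # needs all n places: nominal bounds apply
a_lo, a_up = tightened_bounds(a, n, r)
b_lo, b_up = tightened_bounds(b, n, r)
print((a_lo, a_up), (b_lo, b_up))
print((a_up + b_up) - (a_lo + b_lo))   # width of the sum range, in 2**-n units
```

With one exact operand the width of the sum range drops from two reduced-precision steps to one, which is the halving described above.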

c. Overflow Cases 

Because the upper bound of the operands is determined by rounding up,
any full-precision number greater than the largest representable positive number of
reduced precision r (i.e., $a > 1 - 2^{-r}$) will cause an overflow condition in its upper bound.
In two’s complement, the most-negative number is $-1$, so the analogous overflow
condition need not occur for lower bounds. However, if a particular application requires
the number “1.000...” to represent a value other than $-1$, then there also may be
overflow in the most-negative bound, i.e., when $a < -1 + 2^{-r}$. In either overflow case, the
designer may choose to use the overflow indicators as control signals to activate an
alternate RPR result (such as using only the non-overflow bound instead of a
combination of the two). The alternate RPR result may not be as accurate as the default
RPR result, but in the case of an SEU-induced error the alternate RPR result is still closer
to the correct result than the unprotected (erroneous) precise result would be. A second
option is to scale the operands such that they lie in the range $1 - 2^{-r} > a > -1 + 2^{-r}$ before
determining bounds and beginning the operation, so that the overflow condition on the
bounds is avoided altogether.

When the addition operation itself results in overflow in the precise result, 
the overflow flag is activated as it would be in a normal (non-RPR) addition operation. 
In this case, the overflow will also occur in at least one of the upper or lower bound 
results, and may possibly occur in both. This property enables the designer to use a 
bitwise-majority voter (such as is used in TMR) to determine whether overflow occurs in 
the addition operation. 
d. Special Cases Involving Zero


When one or more of the operands is identically zero, the operand upper
bounds determined as in Equation (6) will be nonzero values. Similarly, when an
operand is between zero and $2^{-r}$ (or zero and $-2^{-r}$), the lower (or upper) bound will be
zero when the value is nonzero. In general, this case has negligible impact in addition
and subtraction for two reasons: (1) every addition or subtraction operation involving
zero is defined, and (2) since addition and subtraction are linear operations, adding or 
subtracting zero has almost the same effect as adding or subtracting a quantity very close 
to zero. The complication arises when the result of an addition or subtraction operation 
must be tested for equality to another result (or to its input values), or when a quantity 
that should (or should not) be zero is used in a subsequent multiplication (or division) 
operation. In these cases, the output from the RPR voter that tells whether the final result 
is the (correct) precise result or the RPR result is very important; it may be an important 
input condition for the next operation. 

To provide a numerical illustration of some of the special cases, Table 2 
presents a nearly exhaustive treatment of 4/6 RPR. It lists exact values and lower and 
upper bounds for each representable number in the range (-1, 1). The optional modified 
upper bounds for the special cases when $x = x_L$ are also included. 
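The bound rules used for Table 2 can be sketched in a few lines of Python. This is only an illustrative sketch, not the thesis implementation: the function name `rpr_bounds`, the integer encoding of operands, and the use of `None` to stand for an overflow (OV) entry are all assumptions introduced here.

```python
def rpr_bounds(k, n=6, r=4):
    """Return (lower, upper, modified_upper) for an n-bit two's-complement fraction.

    k is the integer code of the operand (value = k / 2**(n-1)).
    Each bound is the integer code of an r-bit fraction (value = code / 2**(r-1)).
    None stands for an overflow (OV) of the reduced-precision range.
    """
    shift = n - r
    max_code = 2 ** (r - 1) - 1           # largest positive r-bit code, e.g. 0.111
    lower = k >> shift                    # arithmetic shift rounds toward -infinity
    upper = lower + 1 if lower < max_code else None
    exact = (k % (1 << shift)) == 0       # operand is exactly representable in r bits
    modified_upper = lower if exact else upper
    return lower, upper, modified_upper


# A few 4/6 RPR operands: value = k / 32, bounds are in units of 1/8.
for k in (31, 28, 16, 0, -1, -4):
    print(k / 32, rpr_bounds(k))
```

The modified upper bound collapses onto the operand itself whenever the operand is already representable at reduced precision, which is also what removes the overflow for an operand such as 0.11100.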

The maximum error in any addition or subtraction calculation protected using 
RPR is determined by the precision of the reduced-precision operation. As implied by 
Figure 13, the range between the upper and lower bounds on sum c or difference d is 

$(c + 2\varepsilon) - (c - 2\varepsilon) = 4\varepsilon, \quad \varepsilon \le 2^{-(r+1)}$ (8) 

where r is the precision of the bound operations. However, Equation (5) states that the 
maximum error in a calculation of precision r is $2\varepsilon$. In order to limit the 
difference between the correct precise result and the RPR result to $2\varepsilon$, the RPR result must 
actually be the average of the upper and lower bounds 

$c_{RPR} = \frac{c_U + c_L}{2}$ (9) 

In addition and subtraction, Equation (9) is also equivalent to adding $2\varepsilon$ to the lower 
bound result. 

The manipulation of the bound results required to produce the RPR result is one 
component of an interesting problem concerning the use of RPR protection for small 
circuits. This problem is explored further in the demonstration and voter sections. 

FULL PRECISION (6)        REDUCED PRECISION (4) 
Value       Binary        Lower     Upper     Modified Upper 
 0.96875    0.11111       0.111     OV        OV 
 0.93750    0.11110       0.111     OV        OV 
 0.90625    0.11101       0.111     OV        OV 
 0.87500    0.11100       0.111     OV        0.111 
 0.84375    0.11011       0.110     0.111     0.111 
 0.81250    0.11010       0.110     0.111     0.111 
 0.78125    0.11001       0.110     0.111     0.111 
 0.75000    0.11000       0.110     0.111     0.110 
 0.71875    0.10111       0.101     0.110     0.110 
 0.68750    0.10110       0.101     0.110     0.110 
 0.65625    0.10101       0.101     0.110     0.110 
 0.62500    0.10100       0.101     0.110     0.101 
 0.59375    0.10011       0.100     0.101     0.101 
 0.56250    0.10010       0.100     0.101     0.101 
 0.53125    0.10001       0.100     0.101     0.101 
 0.50000    0.10000       0.100     0.101     0.100 
 0.46875    0.01111       0.011     0.100     0.100 
 0.43750    0.01110       0.011     0.100     0.100 
 0.40625    0.01101       0.011     0.100     0.100 
 0.37500    0.01100       0.011     0.100     0.011 
 0.34375    0.01011       0.010     0.011     0.011 
 0.31250    0.01010       0.010     0.011     0.011 
 0.28125    0.01001       0.010     0.011     0.011 
 0.25000    0.01000       0.010     0.011     0.010 
 0.21875    0.00111       0.001     0.010     0.010 
 0.18750    0.00110       0.001     0.010     0.010 
 0.15625    0.00101       0.001     0.010     0.010 
 0.12500    0.00100       0.001     0.010     0.001 
 0.09375    0.00011       0.000     0.001     0.001 
 0.06250    0.00010       0.000     0.001     0.001 
 0.03125    0.00001       0.000     0.001     0.001 
 0.00000    0.00000       0.000     0.001     0.000 
-0.03125    1.11111       1.111     0.000     0.000 
-0.06250    1.11110       1.111     0.000     0.000 
-0.09375    1.11101       1.111     0.000     0.000 
-0.12500    1.11100       1.111     0.000     1.111 
-0.15625    1.11011       1.110     1.111     1.111 
-0.18750    1.11010       1.110     1.111     1.111 
-0.21875    1.11001       1.110     1.111     1.111 
-0.25000    1.11000       1.110     1.111     1.110 
-0.28125    1.10111       1.101     1.110     1.110 
-0.31250    1.10110       1.101     1.110     1.110 
 ... 
-0.71875    1.01001       1.010     1.011     1.011 
-0.75000    1.01000       1.010     1.011     1.010 
-0.78125    1.00111       1.001     1.010     1.010 
-0.81250    1.00110       1.001     1.010     1.010 
-0.84375    1.00101       1.001     1.010     1.010 
-0.87500    1.00100       1.001     1.010     1.001 
-0.90625    1.00011       1.000     1.001     1.001 
-0.93750    1.00010       1.000     1.001     1.001 
-0.96875    1.00001       1.000     1.001     1.001 

Table 2. Upper and Lower Bounds for 4/6 RPR Showing CO, OV and Modified $x_U$. 
3. Demonstration of RPR in Addition and Subtraction 


The overall structure of an RPR operation contains two modules: (1) the high- and 
low-precision copies of the operation itself, and (2) the RPR voter that contains logic to 
select the final result and report any errors. This two-module configuration is shown in 
Figure 17. The incoming clock signal is necessary in order to synchronize the operations, 
particularly when the degree of RPR is small (significant reduction), because in that case 
the reduced-precision calculations have significantly less delay than the full-precision 
calculation. The difference in delay can cause errors if the output availability is not 
synchronized. There are six functional inputs to an RPR adder: the two operands a and b 
(of precision n), and each operand’s upper and lower bounds (of lower precision r). 
There is also a clock signal input. There is at least one output from an RPR adder: the 
chosen result. In Figure 17 there are many possible outputs shown; these are discussed in 
greater detail in the section on voter logic. The remaining signals in the top-level RPR 
module are the intermediate results of the precise, upper bound and lower bound addition 
operations. These intermediate results are fed into the RPR voter, which determines what 
the final result should be. 



Figure 17. RPR Adder Top-Level Block Diagram (OV = overflow). 


The first module, the operation, is shown in Figure 18 for addition only. It 
consists of three adders and clocked registers to synchronize the inputs and outputs. 
There is one full-precision (n-bit) adder and two reduced-precision (r-bit) adders (one 
for each bound calculation). These three adders are analogous to the three identical 
copies of a full-precision operation that exist in a TMR-protected operation, as depicted 
in Figure 19. The architecture shown in Figure 18 can be used only for addition because 
the incoming bounds are arranged in the configuration that produces 
upper and lower bounds on a sum, not a difference (see Equation (7)). In a subtraction 
operation, the upper and lower bounds of a are paired with the opposite bounds of b (i.e., 
lower and upper, respectively). However, a designer who wanted to use only one RPR 
implementation for both addition and subtraction could find the two’s complement of 
the subtrahend and then use this RPR operation module to add the minuend and the 
complemented subtrahend; that is the alternative to building the equivalent subtraction 
operation module. 
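The pairing of bounds for sums and differences can be illustrated with exact rational arithmetic. The sketch below is not taken from the thesis: the helper names are hypothetical, and it assumes the reduced-precision grid has spacing 2**-r and that bounds are found by rounding toward negative infinity.

```python
from fractions import Fraction


def operand_bounds(x, r):
    """Lower and upper bounds of x on a grid of spacing 2**-r (assumed convention)."""
    step = Fraction(1, 2 ** r)
    lower = (x // step) * step       # floor division rounds toward -infinity
    return lower, lower + step


def sum_bounds(a, b, r):
    (a_l, a_u), (b_l, b_u) = operand_bounds(a, r), operand_bounds(b, r)
    return a_l + b_l, a_u + b_u      # like bounds are paired


def difference_bounds(a, b, r):
    (a_l, a_u), (b_l, b_u) = operand_bounds(a, r), operand_bounds(b, r)
    return a_l - b_u, a_u - b_l      # opposite bounds of b are paired with a


a, b = Fraction(19, 32), Fraction(-7, 32)
print(sum_bounds(a, b, 3), a + b)
print(difference_bounds(a, b, 3), a - b)
```

Negating b swaps the roles of its bounds (the lower bound of -b is the negated upper bound of b), which is why a single adder module can serve both operations once the subtrahend and its bounds are complemented.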



Figure 18. RPR Adder - Operation Block Diagram. 

Figure 19. TMR Adder - Operation Block Diagram (compare to Figure 18). 


The architecture of the second module, the RPR voter, is shown in Figure 20. In 
an adder or subtractor, the “Generate RPR Result” block may be replaced by a block that 
adds $2\varepsilon$ to the lower bound, as noted after Equation (9). For comparison, a traditional 
(TMR) voter is depicted in Figure 21. 

Figure 20. RPR Adder - Voter Block Diagram. 



Figure 21. TMR Adder - Voter Block Diagram (compare to Figure 20). 


The three comparators in Figure 20 compare two r-bit numeric values; they are 
not bitwise operations. The comparators must compare numeric values because there is 
no way to guarantee any individual bits in the upper or lower bound results should be the 
same as in the precise result. A carry-out generated between the lower bound and precise 
result or precise result and upper bound may propagate anywhere from one to r bits 
toward the MSB of the result; therefore there is no way to predict how many bits (zero to 
r) of the bound results and precise results should match. 

As defined in this work, the correct precise sum will always be greater than or 
equal to the sum of the lower bounds of the operands, and it will always be less than the 
sum of the upper bounds. The output of each comparator is a control signal that tells the 
RPR result selection block whether there was an error in the comparison result (e.g., ‘1’ if 
the precise result was less than the lower bound). The RPR result selection block checks 
the output from each compare operation, and selects the best final result to report based 
on Table 3. The benefit of comparing the upper bound to the lower bound in addition to 
comparing the precise result to each bound is that if there is an error in one of the 
precise-result comparisons and in the upper-lower bound comparison, then assuming a 
single-error scenario, the error is in the bound calculation and the precise solution is in fact 
correct. The RPR selection logic output can also be modified to report the result and a 
single flag, instead of two. The single flag indicates whether the final output is the 
precise result or the RPR result (since both final result options have precision n, they are 
indistinguishable without a separate report). Reporting errors is important both for 
locating areas of the processor that need to be reconfigured due to SEU, and for 
identifying the sources of imprecise calculations that may affect system performance. 


Precise < Lower    Precise > Upper    Lower > Upper    Result Reported 
      0                  0                  0          Precise 
      0                  0                  1          X* 
      0                  1                  0          RPR 
      0                  1                  1          Precise 
      1                  0                  0          RPR 
      1                  0                  1          Precise 
      1                  1                  0          X* 
      1                  1                  1          X* 


Table 3. RPR Result Selection Logic (* indicates multiple error condition). 
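The selection logic of Table 3 maps directly onto a lookup table. The following sketch is a simplified software model, not the hardware voter: the names `SELECT` and `rpr_vote` are hypothetical, values are compared as exact fractions rather than as r-bit fields, and the RPR result is taken as the average of the two bounds as in Equation (9).

```python
from fractions import Fraction

# Table 3 as a lookup: (precise < lower, precise > upper, lower > upper) -> source
SELECT = {
    (0, 0, 0): "precise",
    (0, 0, 1): "multiple",   # X*: multiple-error condition
    (0, 1, 0): "rpr",
    (0, 1, 1): "precise",
    (1, 0, 0): "rpr",
    (1, 0, 1): "precise",
    (1, 1, 0): "multiple",
    (1, 1, 1): "multiple",
}


def rpr_vote(precise, lower, upper):
    """Return (result, source) following Table 3; result is None for X*."""
    flags = (int(precise < lower), int(precise > upper), int(lower > upper))
    source = SELECT[flags]
    if source == "precise":
        return precise, source
    if source == "rpr":
        return (lower + upper) / 2, source   # average of the bound results
    return None, source


good = rpr_vote(Fraction(3, 8), Fraction(1, 4), Fraction(1, 2))
bad_precise = rpr_vote(Fraction(7, 8), Fraction(1, 4), Fraction(1, 2))
bad_lower = rpr_vote(Fraction(3, 8), Fraction(3, 4), Fraction(1, 2))
print(good, bad_precise, bad_lower)
```

The third call shows the case the text singles out: a corrupted lower bound trips both its own comparison and the lower-upper comparison, so the precise result is still reported.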

The overflow condition check in Figure 20 is a standard bitwise majority voter 
with single-error correction and voter-error detection, commonly used in TMR (see 
Figure 1). This voter is sufficient for the overflow check because if there is no error in 
the entire RPR operation, then the majority overflow report will be correct. If there is an 
error in the RPR operation (as detected in the comparators), then overflow reporting is 
ignored. Existence of an overflow condition in the sum may be recomputed based on the 
source of the erroneous result calculation - e.g., if the sum lower bound is determined to 
be in error and the overflow checker has only two out of three signals high, the precise 
result most likely has not caused overflow, and may be used without issue. The exact 
nature of the error conditions is revisited in the section on general RPR voter logic. 

4. Comparing RPR and TMR Implementations 

The main advantage of RPR over TMR is that it requires less space on an FPGA 
while still providing the same accuracy of computation in a no-error situation, and 
accuracy within a certain tolerance when errors do occur. This reduction in area means 
that using RPR instead of TMR to protect a circuit requires less power - or, conversely, 
more functionality may be obtained using an FPGA of the same area operating with the 
same amount of power. Snodgrass showed in [3] that an implementation of the CORDIC 
algorithm protected using TMR requires two to three times the power required by the 
same process protected using RPR. The FPGA area required by the components and 
complete circuits for several RPR and TMR implementations of a simple addition 
operation is presented in Table 4. 


Redundancy    Precision          Slice Count 
Type          (or Degree)        Operation    Voter    Complete* 
TMR           64                 99           65       163 
RPR           32/64 (0.5)        67           115      181 
RPR           16/64 (0.25)       51           95       146 
RPR           8/64 (0.125)       43           58       100 
TMR           32                 51           33       83 
RPR           16/32 (0.5)        35           78       114 
RPR           8/32 (0.25)        27           42       68 
TMR           16                 27           17       43 
RPR           8/16 (0.5)         19           34       52 

*Complete circuit area is computed independently of “Operation + Voter” 

Table 4. Area Required By TMR and Representative RPR Addition Experiments. 
The first significant result of these experiments is that using RPR to protect an 
operation does not guarantee less area (and lower power) than TMR in every case. The 
slice counts for the complete implementations of 0.5 RPR at any original precision are 
greater than the slice count for the TMR implementation of the same precision. 
Furthermore, it is evident that this is due entirely to the large amount of logic required for 
the RPR voter versus the logic required for the TMR voter. The RPR addition operation 
module is always smaller than the TMR addition operation module by a significant 
amount. Some calculated relative FPGA area requirements of TMR and RPR for various 
n and r are listed in Table 5. 


Redundancy    Precision        Percent of Circuit    Ratio of RPR/TMR    Ratio of RPR/TMR    Ratio of RPR/TMR 
Type          (or Degree)      Occupied by Voter     Operation Size      Voter Size          Total Size 
TMR           64               39.9%                 (1.00)              (1.00)              (1.00) 
RPR           32/64 (0.5)      63.5%                 0.68                1.77                1.11 
RPR           16/64 (0.25)     65.1%                 0.52                1.46                0.90 
RPR           8/64 (0.125)     58.0%                 0.43                0.89                0.61 
TMR           32               39.8%                 (1.00)              (1.00)              (1.00) 
RPR           16/32 (0.5)      68.4%                 0.69                2.36                1.37 
RPR           8/32 (0.25)      61.8%                 0.53                1.27                0.82 
TMR           16               39.5%                 (1.00)              (1.00)              (1.00) 
RPR           8/16 (0.5)       65.4%                 0.70                2.00                1.21 

Table 5. FPGA Area Comparison for RPR and TMR Adders. 


Table 5 shows that in order for RPR to be a more desirable fault-tolerance 
approach than TMR for a simple operation like addition or subtraction, the degree of 
RPR must be significantly less than 0.5 - and that for both the adder and the voter in an 
RPR addition process to be smaller than the analogous TMR modules, the degree of RPR 
must be less than 0.25. The performance impact of using reduced-precision results in 
systems protected with very small degrees of RPR (e.g., 8/64) is investigated in Chapter 
IV. 
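The relative figures in Table 5 are derived from the slice counts in Table 4. As a rough check, the derivation can be reproduced as follows; the dictionary layout and the function name `area_ratios` are illustrative assumptions, while the slice counts themselves are copied from Table 4.

```python
# Slice counts from Table 4: (operation, voter, complete)
TMR = {64: (99, 65, 163), 32: (51, 33, 83), 16: (27, 17, 43)}
RPR = {
    (32, 64): (67, 115, 181),
    (16, 64): (51, 95, 146),
    (8, 64): (43, 58, 100),
    (16, 32): (35, 78, 114),
    (8, 32): (27, 42, 68),
    (8, 16): (19, 34, 52),
}


def area_ratios(r, n):
    """Relative area of an r/n RPR adder versus the n-bit TMR adder."""
    op, voter, total = RPR[(r, n)]
    t_op, t_voter, t_total = TMR[n]
    return {
        "voter_share": round(100 * voter / total, 1),  # percent of circuit in voter
        "operation": round(op / t_op, 2),
        "voter": round(voter / t_voter, 2),
        "total": round(total / t_total, 2),
    }


for (r, n) in RPR:
    print(f"{r}/{n}", area_ratios(r, n))
```

Only the rows with a total ratio below 1.0 represent an area saving over TMR, which is the basis for the requirement that the degree of RPR be well under 0.5 for a simple adder.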
C. MULTIPLICATION 

1. Computation Error in Multiplication 

Multiplication in a digital computer is fundamentally made up of a series of 
addition operations. It is frequently implemented as an accumulating process where a 
computer begins with a register set to zero, then shifts and adds the multiplicand a to the 
register some number of times, where the addition and shift amount are dictated by the 
digits in the multiplier b [27]. Because performing multiplication is mathematically 
equivalent to performing addition repeatedly, an initial hypothesis may be that the 
magnitude of the total error $\varepsilon$ accrued in a multiplication operation depends on the 
number of times addition is performed - i.e., the value of the multiplier. However, 
scaling the operands prior to performing the multiplication minimizes this error 
accumulation. Without knowing the exact values of the multiplier and multiplicand, it is 
still possible to bound the error $\varepsilon$ on an RPR multiplication operation based on the 
precisions n and r of the operands, as will be shown in this section. 

Scaling multiplication operands such that $-1 < (a, b) < 1$ guarantees that the 
product c will always have absolute value equal to or smaller than both operands - that is, 
c will be closer to zero on a number line than either of the operands. To understand this 
point, it is helpful to view the entire set of representable numbers in a fractional fixed- 
point binary system as depicted in Figure 22. There is a finite set of representable 
numbers between -1 and 1 (non-inclusive), and the smallest distance between any two 
numbers on this number line is $2^{-n}$. The number line in Figure 22 is logarithmic in base 
2 to show that the relative error between any two numbers of precision n is greater when 
those numbers are close to zero. 



Figure 22. Number Line for Multiplication in Fractional Fixed-Point. 

Multiplying any two numbers of precision n in a digital computer generates a 
product of precision 2n. When multiplying binary integers, the product extends toward 
infinity (or negative infinity), increasing the order of the most significant bit (MSB). The 
advantage of scaling operands such that they fall between -1 and 1 is that the result 
always falls within this range as well; the product extends toward zero, decreasing the 
order of the least significant bit (LSB). To obtain a final product with original precision 
n (versus 2n), the exact product p is approximated by adding $\tfrac{1}{2} \times 2^{-n}$ to it and 
throwing away the n LSB (bits n-1 down to 0). The bound on error acquired in fraction 
multiplication using this approximation is, as specified in [26], 

$p = (a \times b) + \varepsilon$ (10) 
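The rounding step described above (add half of an LSB, then discard the n low-order bits) can be sketched with integer arithmetic. This is an illustrative sketch only: the function name is hypothetical, operands are taken as non-negative magnitudes, and n is assumed to count fractional bits.

```python
def fixed_mul_round(a, b, n):
    """Multiply two non-negative n-bit fractions given as integer codes.

    An operand code c represents the value c / 2**n. The exact product has 2n
    fractional bits; adding half an n-bit LSB (2**(n-1) in the 2n-bit code) and
    dropping the n low bits rounds it back to n fractional bits.
    """
    exact = a * b                               # value = exact / 2**(2n)
    rounded = (exact + (1 << (n - 1))) >> n     # value = rounded / 2**n
    return exact, rounded


n = 8
a, b = 0b01110010, 0b10110101                   # 114/256 and 181/256
exact, rounded = fixed_mul_round(a, b, n)
error = abs(rounded / 2 ** n - exact / 2 ** (2 * n))
print(rounded, error)
```

In this sketch the rounding error never exceeds half of an n-bit LSB, i.e. 2**-(n+1), regardless of the operand values.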

Multiplication generates computational error of the same maximum magnitude 
regardless of the sign of the operands. In fact, multiplication is mathematically the same 
operation on any combination of negative or positive numbers; the only difference is the 
sign of the product. The sign of the product is determined by the XOR function of the 
signs of the operands; if one operand is negative and one is positive, the product is 
negative; if the signs of the operands are the same, the product is positive. 

Although multiplication is mathematically the same for positive and negative 
numbers, in digital computers the operation is more complicated due to the common 
representation of numbers in two’s complement form. When numbers in two’s 
complement need to be multiplied, they are often transformed to sign-magnitude form 
and multiplied as positive numbers with separate sign determination. Alternatively, 
numbers may be multiplied in two’s complement form - but corrective factors must be 
added to counter the effects of the complement representation. One approach is to 
multiply the two’s complement operands of precision n without the sign bit (MSB), and 
to examine the sign bit separately. If one of the operands is negative, add the two’s 
complement of the other operand, shifted left by n places (i.e., multiplied by $2^{n}$). If both 
operands are negative, no change need be applied. Details on this and other more 
complicated methods of correcting multiplication in two’s complement are available in 
[27]. In all cases, the amount of logic required to find the correction factor when one 
operand is negative - some expression of the two’s complement of the positive operand - 
is at least as much as that which is required to convert the negative 
operand to sign-magnitude form prior to the multiplication. Therefore, for the purposes 
of this research, it is assumed that multiplication operations receive input in sign- 
magnitude form. The magnitude multiplication is then performed as described before 
Equation (10), and the sign of the result is determined using an XOR gate with the 
signs $s_a$ and $s_b$ of the operands as input. The magnitude and the sign are checked 
separately for errors in the voter, and the correct sign is prepended to the final result. 
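A minimal sketch of the sign-magnitude arrangement assumed above follows. The function name and argument layout are hypothetical; sign bits are 0 for positive and 1 for negative, and magnitudes are integer codes with n fractional bits.

```python
def sign_magnitude_multiply(sign_a, mag_a, sign_b, mag_b, n):
    """Multiply two sign-magnitude fractions; magnitudes have n fractional bits."""
    sign_p = sign_a ^ sign_b                        # XOR of the operand signs
    mag_p = (mag_a * mag_b + (1 << (n - 1))) >> n   # rounded n-bit magnitude
    return sign_p, mag_p


print(sign_magnitude_multiply(1, 114, 0, 181, 8))   # negative times positive
print(sign_magnitude_multiply(1, 114, 1, 181, 8))   # negative times negative
```

Because the magnitude path is identical for every sign combination, only the one-bit sign computation needs separate checking in the voter.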

2. Upper and Lower Bound Determination for Multiplication 

The same mathematical convention is used to define upper and lower bounds of 
multiplication operands as is used for addition/subtraction operands; the upper bound is 
the next number to the right on a number line, and the lower bound is the next number to 
the left on a number line (Figure 23). 

<- Negative Numbers                                   Positive Numbers -> 

(1).100 (1).011 (1).010 (1).001 (0).000 (0).001 (0).010 (0).011 (0).100 

Figure 23. Upper and Lower Bounds of Fixed-Point Sign-Magnitude Numbers. 

The relationship of the products of operands’ upper and lower bounds to the 
precise product is explained in the following paragraph for the cases of positive and 
negative numbers. However, since multiplication of numbers in sign-magnitude 
representation is performed on only the magnitude of the operands, one need only 
consider the positive cases (numbers to the right of zero, inclusive) for implementation 
purposes. 

Given a multiplication problem of precision n, the operand lower bounds $(a_L, b_L)$ 
and upper bounds $(a_U, b_U)$ are found in the same manner as they are for addition and 
subtraction. That is, the operand a of precision n will always have RPR lower and upper 
bounds that are the next lower and higher representable numbers of precision r such that 

$a_L \le a < a_U$, where 
$a - a_L = \varepsilon_{a,L}, \quad 0 \le \varepsilon_{a,L} < 2^{-r}$, 
$a_U - a = \varepsilon_{a,U}, \quad 0 < \varepsilon_{a,U} \le 2^{-r}$, and 
$\left| \varepsilon_{a,L} + \varepsilon_{a,U} \right| = 2^{-r}$ (11) 

As discussed in the previous section, the nature of multiplying fixed-point 
fractional numbers is such that the product p is always smaller (closer to zero) than either 
of the operands. This property can be applied to the RPR operation; multiplying the 
smaller (lower, if positive) bounds of two operands will always produce a result that is 
smaller than the precise result. Likewise, the product of the (positive) upper bounds will 
always be larger in magnitude than the precise result. The relationships are described 
mathematically as 

$p_L \le p < p_U$, where 
IF $a, b > 0$: $p_L = a_L \times b_L$ and $p_U = a_U \times b_U$ 
IF $a > 0, b < 0$: $p_L = a_U \times b_L$ and $p_U = a_L \times b_U$ (12) 
IF $a < 0, b > 0$: $p_L = a_L \times b_U$ and $p_U = a_U \times b_L$ 
IF $a, b < 0$: $p_L = a_U \times b_U$ and $p_U = a_L \times b_L$ 
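The four sign cases can be checked numerically with exact rational arithmetic. The sketch below is an illustration under stated assumptions rather than the thesis design: the helper names are hypothetical, bounds follow the number-line convention (lower bound to the left, upper bound to the right) on a grid of spacing 2**-r, and zero operands are grouped with the positive cases for simplicity.

```python
from fractions import Fraction


def operand_bounds(x, r):
    """Next grid point at or to the left of x, and the next grid point to the right."""
    step = Fraction(1, 2 ** r)
    lower = (x // step) * step
    return lower, lower + step


def product_bounds(a, b, r):
    """Lower and upper bounds on a*b, selected by the signs of the operands."""
    (a_l, a_u), (b_l, b_u) = operand_bounds(a, r), operand_bounds(b, r)
    if a >= 0 and b >= 0:
        return a_l * b_l, a_u * b_u
    if a >= 0 and b < 0:
        return a_u * b_l, a_l * b_u
    if a < 0 and b >= 0:
        return a_l * b_u, a_u * b_l
    return a_u * b_u, a_l * b_l      # both operands negative


x, y = Fraction(19, 32), Fraction(7, 32)
for a, b in ((x, y), (x, -y), (-x, y), (-x, -y)):
    p_l, p_u = product_bounds(a, b, 3)
    print(a * b, p_l, p_u)
```

In every case the bound with the larger operand magnitudes produces the product farther from zero, so the precise product is bracketed regardless of sign.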

The magnitude of the difference between the upper and lower bounds of an RPR 
product will always be on the order of $2^{-r}$. Therefore, in multiplication the product may 
be tested the same way as the sum is tested in addition: the first r bits of the precise 
product should always be a number that is greater than or equal to the product of the 
lower bounds, and less than the product of the upper bounds. It is important to note that 
this is only true if the comparison is done before the RPR bound products are rounded as 
described above Equation (10). This is because the addition of $2^{-(r+1)}$ to the lower bound 
product sometimes produces a carry that makes the first r bits of $p_L (= a_L \times b_L)$ equal to 
the first r bits of $p_U (= a_U \times b_U)$, making both bound results greater than the precise 
product. One example of this failure is demonstrated in Figure 24. 
Figure 24. Erroneous Bounds Due To Rounding Products. 


As with addition, there are special cases in multiplication that need to be 
discussed. In multiplication these are small operands (those that are close to zero, or less 
than $2^{-r}$) and cases involving true zero. 

a. Special Cases Involving Zero 

The zero property of multiplication states that if the product of two 
numbers is zero, then at least one of the operands must be zero. Conversely, in arithmetic 
zero times any number is zero. Particularly in systems where repetitive multiplication is 
performed on values that may equal zero (e.g., gain multiplication in a control feedback 
loop), it is preferable to test the operands for zero first in order to avoid incorrect accrual 
of nonzero values. If either operand is identically zero, then zero (to precision n and r) 
should be supplied directly as both the precise result and its bounds, rather than executing 
the operation. 


b. Special Cases Involving Small Numbers 

A very small number in RPR multiplication is defined as one with magnitude less 
than $2^{-r}$ but greater than $2^{-n}$. There are two main concerns that arise when dealing with 
very small numbers in RPR multiplication: (1) the presence of zero in bound calculations 
when neither operand is truly equal to zero, and (2) the generation of an RPR result that is 
several orders of magnitude larger than the precise result when both precise operands are 
very close to zero (i.e., when the precise result has a value much smaller than the smallest 
representable RPR value $2^{-r}$). 

When one (or both) operands has magnitude less than $2^{-r}$, that operand’s 
lower bound will be zero. When either lower bound is equal to zero, the lower bound $p_L$ 
of the product is necessarily also zero. If the RPR product (used when an error is found 
in the precise product) is obtained by finding the arithmetic mean of the upper and lower 
bound products, then the RPR result will be greater than or equal to $2^{-(r+1)}$ and the false 
zero will not propagate. However, the lower bound on the new product will also be zero 
(since the product is always smaller in magnitude than either factor); in this case the next 
operation must continue to avoid obliterating the very small product by ensuring that true 
zero is not propagated as the result. 

When both operands are of magnitude $2^{-(r+1)}$ or smaller, the first 2r digits 
of the product will all be zero. The RPR result in this case will be equal to $2^{-(r+1)}$, which 
may be up to (r - n) orders of magnitude larger than the precise result. However, the 
precision of fixed-point multiplication is such that if there is an error in the precise result 
that is detectable through comparison with the upper bound $2^{-r}$ and lower bound 0, the 
RPR result will have less error than the incorrect precise result and it will be the best 
possible solution in that case. A value of “true” on a status signal indicating that the RPR 
result was used (i.e., there was an error in the precise result) in multiplication also alerts a 
designer that if the operands were smaller than $2^{-r}$, the RPR product supplied may be 
orders of magnitude larger than the (correct) precise product. 

The generation of the RPR result requires some discussion. 
The RPR result is found after the computation of the precise, upper and lower bound 
products. The two principal ways to obtain an RPR product are to find the arithmetic 
mean of the full upper and lower bound products (giving a result of precision 2r), and to 
find the arithmetic mean of the rounded upper and lower bound products (giving a result 
of precision r). The first of these methods gives the best result - i.e., the result with the 
least error relative to the correct precise result - but it requires more area to implement 
since it involves operands of size 2r versus size r. The impact of obtaining the best RPR 
result is shown in the demonstration of RPR multiplication. 

3. Demonstration of RPR in Multiplication 

A complete RPR multiplication implementation contains an operation module and 
a voter module, as with addition or subtraction. The main difference between 
multiplication and addition/subtraction is the set of signals passed from the operation to 
the voter. The multiplication voter uses the intermediate results of the upper and lower 
bound computation - of precision 2r - to compare with the first 2r bits of the precise 
product and/or to generate the RPR result. This is depicted in Figure 25. 

Figure 25. RPR Multiplier Top-Level Block Diagram. 


Using 2r bits instead of r bits to generate the RPR result increases the size of the 
voter, the impact of which is shown in the following section on comparing RPR 
multiplication with TMR multiplication. However, this allows the most accurate RPR 
result to be generated and eliminates the comparison issues associated with rounding the 
product lower bound. 

The operation module structure for multiplication is essentially the same as the 
operation module for addition. The differences between addition and multiplication 
designs include the lack of any overflow flags (since the fractional product always falls 
within the allowed range -1 < c < 1) and the increase in output ports required due to the 
propagation of the 2r-width bound results. A block diagram showing these details of the 
multiplication operation module structure is drawn in Figure 26. For comparison, a TMR 
multiplication operation module is shown in Figure 27. 

Figure 26. RPR Multiplier - Operation Block Diagram. 

Figure 27. TMR Multiplier - Operation Block Diagram (compare to Figure 26). 


The second RPR module, the voter, is almost the same in structure as the RPR 
voter for addition. The single bitwise majority voter checks the sign bit of the product 
instead of the overflow flag, so that circuitry remains the same. The RPR voters for 
addition and multiplication differ primarily in the comparators and in the generation of 
the RPR result. Since multiplication of two r-bit numbers generates an intermediate 
product of precision 2r, it is straightforward to use the long intermediate products of the 
upper and lower bounds in the RPR voter, and doing so yields the most accurate RPR result. The 
comparators may be either r- or 2r-bit operations; 2r-precision comparators detect more 
errors (down to $2^{-2r}$ in magnitude). However, a comparator occupies roughly as many 
FPGA slices as its precision (i.e., a 32-bit comparator maps to 33 slices on an FPGA), so 
if area is at a premium and other processes in the system are detecting errors only to r bits 
of precision, it may be preferable to use only the first r bits of the bound products for 
testing the precise result in multiplication. This approach is shown in Figure 28. 
Regardless of the bound values used for comparison, the long bound products may be 
used to compute the most accurate possible RPR result. 

Figure 28. RPR Multiplier - Voter Block Diagram. 


The voter module for a TMR operation is the same regardless of the operation 
being performed, since the voter simply takes three copies of the same result and bitwise- 
compares them, correcting any single error. The overflow bit check can be used as a sign 
bit check, although it is not necessary if the sign bit is included in the n-bit product. For 
comparison with the RPR voter, a TMR voter is shown in Figure 29. 

Figure 29. TMR Multiplier - Voter Block Diagram (compare to Figure 28). 

The RPR result selection block in the multiplication voter is the same as in the 
addition voter, both in operation and input (data and control signals), with one exception: 
the checked/corrected product sign bit is fed into the RPR result selection block 
separately because it is not checked as part of the comparison process. The bound tests 
are executed on only the magnitudes of the precise and bound results; the sign bits are 
checked using a bitwise majority voter, as in TMR. 

The other logic block that differs between the RPR voter for multiplication 
and the RPR voter for addition is the generation of the RPR result. In RPR addition, this block can 
simply extend the lower bound sum from precision r to precision n, with ‘1’ in the (r + 1)th 
position and ‘0’ in the remaining n - (r + 1) positions, as noted after Equation (9). 
However, in RPR multiplication the 2r-bit bound products rarely differ by only $1 \times 2^{-2r}$. 
Therefore the RPR result generation block must add the two bound results and shift the 
sum right one bit to get their arithmetic mean, which is then concatenated with additional 
0’s to obtain the n-bit RPR result. This process includes an entire arithmetic operation 
(addition) that itself was investigated in the previous section as a potential object of RPR 
application. Using addition in the voter for multiplication both increases the size of the 
voter and introduces the question of “trusted” operations: if a designer is protecting low- 
level arithmetic operations with RPR, is it necessary to protect the addition in the RPR 
“overhead” (the voter) as well? Or is that a calculated risk that a designer will accept? 
This is a potential drawback of RPR as compared to TMR, and it should be 
considered in addition to the quantitative comparison expounded in the next section. 
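
The bound test and the mean-of-bounds fallback described above can be sketched as follows. This is a simplified illustration under stated assumptions: it works on unsigned magnitudes only (the sign bit is voted separately), it assumes both bound products are themselves correct, and the function name `rpr_multiply_select` is hypothetical rather than taken from the voter design in this research.

```python
def rpr_multiply_select(precise: int, lower: int, upper: int, r: int, n: int) -> int:
    """Return the n-bit magnitude chosen by a simplified RPR multiplication voter.

    precise      : n-bit magnitude from the full-precision multiplier
    lower, upper : 2r-bit bound products from the reduced-precision multipliers
    """
    shift = n - 2 * r                  # low-order positions the bounds do not cover
    precise_head = precise >> shift    # first 2r bits of the precise product
    if lower <= precise_head <= upper:
        return precise                 # precise result passes the bound test
    # Bound test failed: fall back to the arithmetic mean of the two bounds
    # (one addition and a one-bit right shift), then pad with zeros to n bits.
    mean = (lower + upper) >> 1
    return mean << shift


# r = 2, n = 8: the bounds are 4-bit values, the precise product has 8 bits.
ok = rpr_multiply_select(0b0111_0101, 0b0110, 0b1000, r=2, n=8)
fixed = rpr_multiply_select(0b1111_0101, 0b0110, 0b1000, r=2, n=8)
```

The addition inside the fallback path is the "trusted" operation discussed above: in this sketch it is not protected by anything.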

4. Comparing RPR and TMR Implementations 

Since multiplication is essentially a collection of many addition operations, it is 
reasonable that mapping a multiplication operation to an FPGA should require 
significantly greater area than an addition operation requires. This is indeed the case; in 
fact, the difference is so great that the size of the multiplication operation modules in 
either TMR or RPR implementations fairly dwarfs the size of their respective voter 
modules. This makes the space and power savings of RPR over TMR a more significant 
benefit. The FPGA area required by the components and complete circuits for several 
RPR and TMR implementations of a simple multiplication operation are presented in Table 6. 


Redundancy   Precision        Slice Count
Type         (or Degree)      Operation    Voter    Complete*

TMR          64               6378         64       6436
RPR          32/64 (0.5)      3208         226      3429
RPR          16/64 (0.25)     2400         130      2526
RPR          8/64 (0.125)     2194         101      2285
TMR          32               1622         32       1112
RPR          16/32 (0.5)      816          114      926
RPR          8/32 (0.25)      610          85       684
TMR          16               413          16       427
RPR          8/16 (0.5)       205          77       274

*Complete circuit area is computed independently of “Operation + Voter” 


Table 6. Area Required By TMR and RPR Multiplication Experiments. 


The size of the TMR voter modules for multiplication is the same as for addition; 
this makes sense because the TMR voting operation is the same regardless of the type of 
process that produces the results on which the voter operates. In the multiplication experiment, 
the addition required to generate the RPR result was considered to be a trusted operation, 
so RPR or other protection was not applied to it. In practice this assumption would need 
to be revisited. Unlike the voter, a TMR multiplier operation module is ten to fifty times 
larger than a TMR adder module for inputs of the same size. In RPR, the multiplication 
operation modules are also many times larger than the addition modules for same-size 
operands, but the space saved by RPR multiplication over TMR multiplication is greater: 
an RPR multiplication module requires 1/3 to 1/2 the FPGA slices of a TMR 
multiplication module, depending on the degree of RPR. The RPR addition operation 
requires 40%-70% of the area of the TMR addition operation. This implies that 
multiplication, being a more complicated operation, is better suited for RPR fault- 
tolerance methods than is addition. 

In order to examine better the relative impact of the operation and voter modules, 
an RPR/TMR area comparison by percent and ratio is presented in Table 7. 


Redundancy   Precision        Percent of Circuit   Ratio of RPR/TMR   Ratio of RPR/TMR   Ratio of RPR/TMR
Type         (or Degree)      Occupied by Voter    Operation Size     Voter Size         Total Size

TMR          64               1.0%                 (1.00)             (1.00)             (1.00)
RPR          32/64 (0.5)      6.6%                 0.50               3.53               0.53
RPR          16/64 (0.25)     5.1%                 0.38               2.03               0.39
RPR          8/64 (0.125)     4.4%                 0.34               1.58               0.36
TMR          32               2.8%                 (1.00)             (1.00)             (1.00)
RPR          16/32 (0.5)      12.3%                0.50               3.56               0.83
RPR          8/32 (0.25)      12.4%                0.38               2.65               0.62
TMR          16               3.7%                 (1.00)             (1.00)             (1.00)
RPR          8/16 (0.5)       28.1%                0.50               4.81               0.64


Table 7. FPGA Area Comparison for RPR and TMR Multipliers. 


Of particular interest in Table 7 are the “percent of circuit occupied by voter” 
values. Because the multiplication operation is so large, the least and greatest relative 
sizes of the voter in any TMR or RPR operation range from 1% to 30%. All cases of 
RPR or TMR multiplication have less of the circuit devoted to the voter than any of the 
cases of addition (in which the portion of the circuit occupied by the voter ranges from 
40% to 68%). This means that even in 8/16 RPR, which would not be used in most 
practical applications, the RPR-protected circuit saves significant area compared to the 
TMR-protected circuit (1.00 - 0.64 = 0.36, or roughly 36%). 
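
The entries of Table 7 follow directly from the slice counts in Table 6. As a worked check, the short sketch below recomputes the 8/16 RPR row; the dictionary layout and the function name `compare` are illustrative choices, not part of the original experiments.

```python
# Slice counts copied from Table 6: (operation, voter, complete circuit)
table6 = {
    ("TMR", "16"): (413, 16, 427),
    ("RPR", "8/16"): (205, 77, 274),
}


def compare(rpr, tmr):
    """Return (voter %, operation ratio, voter ratio, total ratio) for an RPR row."""
    op, voter, total = rpr
    t_op, t_voter, t_total = tmr
    return (
        round(100 * voter / total, 1),   # percent of circuit occupied by voter
        round(op / t_op, 2),             # RPR/TMR operation size
        round(voter / t_voter, 2),       # RPR/TMR voter size
        round(total / t_total, 2),       # RPR/TMR total size
    )


row = compare(table6[("RPR", "8/16")], table6[("TMR", "16")])
print(row)  # (28.1, 0.5, 4.81, 0.64)
```

The last entry gives the area saving quoted in the text: 1.00 - 0.64 = 0.36 of the TMR circuit area.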

Although RPR appears to give the expected benefit of 1/3 to 1/2 space and power 
savings when applied to multiplication, it is still worth noting that the voter modules in 
RPR multiplication are extremely large compared to the TMR voter modules. One of the 
drawbacks to RPR is that not only do the voter modules require more thought and design 
than the simple bitwise majority-voting of TMR, but also they almost always take up 
more space. A TMR voter needs only to compare each bit of three inputs with a few 
logical gates; an RPR voter must execute numerical (value) comparisons of three r-bit 
numbers, generate the RPR solution (which may include additional arithmetic 
operations), and choose the best output based on the result of the bound value tests. This 
complexity is evident in multiplication, where every RPR voter is larger than its 
analogous TMR voter - sometimes three to five times larger. Since all 
the voter circuits are still very small and simple compared to the multiplication operation 
circuits, this complexity of the RPR voters is not as critical as it is in the simpler operation 
of addition. However, it cannot be overlooked. It may be possible to optimize further 
and/or standardize the design of RPR voters in order to make them more comparable to 
the TMR voter, which operates on each result bit independently. This is another point 
discussed in the additional notes on RPR voter design at the conclusion of this chapter. 

D. DIVISION 

1. Computation Error in Division 

Division is the most complicated arithmetic operation, and often presents a special 
problem in digital computing. Most importantly for this discussion, there is no 
synthesizable “divide” operator (e.g., a / b = q) in the standard VHDL arithmetic libraries 
analogous to add (+), subtract (-), or multiply (*). The division operation must instead 
be synthesized as a clocked process made up of successive subtractions or additions, or as 
multiplication by a reciprocal of the divisor (a * 1/b). This is explored further in the 
division implementation discussion. 

Another significant point regarding division in computers is that in fractional 
fixed-point arithmetic - the convention chosen for this research - division operations are 
only meaningful when a < b (i.e., the divisor is greater than the dividend). This is 
because in cases where the divisor is smaller than the dividend, the resulting quotient q 
would fall outside the permitted range (i.e., q < -1 or q > 1). Therefore to execute the 
operation at all, the constraint a < b must be applied to the domain of possible operands 
[28]. 

Assuming that the constraints are met and a suitable implementation is found, the 
computation error in division is similar to that of multiplication. The quotient q of a 
dividend a and a divisor b is traditionally calculated one digit at a time, beginning with 
the MSB. The computation of an n-bit quotient is complete when n+1 digits have been 
calculated and rounded to n bits by adding ‘1’ to the LSB, which is the same rounding 
process that is applied to multiplication. This gives the computation equation 

$$ q = \frac{a}{b} + \varepsilon, \tag{13} $$

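where $\varepsilon$ is the rounding error. As derived in [26], the upper bound on the absolute value 
of the difference between the computed product b*q and the dividend a is therefore 

$$ \lvert b\,q - a \rvert = \lvert b \rvert \, \lvert \varepsilon \rvert \le \lvert b \rvert \, 2^{-n} \tag{14} $$

where b is the fractional divisor. 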
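
A small worked example may help make the rounding step concrete. The sketch below uses exact rational arithmetic to form n+1 quotient bits, rounds them to n bits, and measures the residual |b*q - a|. The function name `rounded_quotient`, the operand values, and the reading of the rounding step as "add 1 at the extra bit, then drop it" are assumptions made for illustration.

```python
from fractions import Fraction


def rounded_quotient(a: Fraction, b: Fraction, n: int) -> Fraction:
    """Quotient a/b rounded to n fractional bits via one extra (n+1)th bit."""
    assert 0 <= a < b < 1, "fractional fixed-point division requires a < b"
    digits = (a * 2 ** (n + 1)) // b   # first n+1 quotient bits, truncated
    rounded = (digits + 1) >> 1        # add '1' at the extra bit, keep n bits
    return Fraction(rounded, 2 ** n)


a, b, n = Fraction(1, 4), Fraction(3, 4), 8
q = rounded_quotient(a, b, n)
residual = abs(b * q - a)              # equals |b| times the rounding error
```

For these operands q = 85/256, and the residual stays below b * 2^-n, which is the form of bound discussed above.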

2. Determining RPR Bounds for Different Division Implementations 

The major methods of implementing division in a computer include basic shift- 
and-subtract schemes (including restoring and nonrestoring solutions); variations on the 
basic schemes such as modular, high-radix (>2) or array dividers; and division by 
convergence. Shift-and-subtract division is difficult to pipeline for quick computation 
due to the need for a conditional subtractor for each quotient bit [29] and therefore 
convergence schemes are more often utilized. Methods of division by convergence 
comprise successive approximation by repeated multiplication, Newton-Raphson 
iteration, and division by reciprocation [27], [28], [29]. 

a. Convergence by Repeated Multiplication 

In division by convergence through repeated multiplication, the quantity 
a/b is multiplied by a sequence of factors such that the denominator ($b \cdot x_1 \cdot x_2 \cdots$) 
converges to 1. This causes the numerator to converge to the quotient q. The next factor 
($x_{i+1}$) is the 2’s complement of the current denominator ($b \cdot x_1 \cdots x_i$). The repeated 
multiplication method has quadratic convergence; the number of multiplications required 
to obtain precision n in the quotient is m, where 

$$ m = \lceil \log_2 n \rceil. \tag{15} $$
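
The convergence scheme can be sketched with exact rational arithmetic so that the iteration itself is visible without fixed-point rounding effects. The sketch assumes a normalized divisor (1/2 <= b < 1), which is the usual condition for the iteration count in Equation (15) to hold; that normalization and the function name are assumptions, not details taken from the text.

```python
from fractions import Fraction
from math import ceil, log2


def divide_by_repeated_multiplication(a: Fraction, b: Fraction, n: int) -> Fraction:
    """Approximate a/b to about n bits; assumes 1/2 <= b < 1 and 0 <= a < b."""
    m = ceil(log2(n))                  # iteration count from Equation (15)
    numerator, denominator = a, b
    for _ in range(m):
        factor = 2 - denominator       # 2's complement of the current denominator
        numerator *= factor            # converges to the quotient
        denominator *= factor          # converges to 1
    return numerator


q = divide_by_repeated_multiplication(Fraction(1, 4), Fraction(3, 4), n=16)
error = abs(q - Fraction(1, 3))
```

With b = 3/4 the denominator starts 1/4 away from 1, and that distance is squared on every pass, so four iterations are more than enough for 16 bits.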

Because this method consists of several multiplications, it may seem that 
the total error accumulated is an aggregate of the computation error accrued in each 
multiplication. However, if the entire process is carried out as a block operation in both 
precise and RPR versions at the double precision (2n or 2r, respectively) of the 
intermediate products, then the computation error will only occur in the rounding of the 
final result. The final result is obtained after $\lceil \log_2 n \rceil$ or $\lceil \log_2 r \rceil$ iterations, and the precise and 
bound results may be compared as in regular multiplication. This leads to a suggestion 
that RPR may be applicable to block operations as well as to individual elements - in fact 
that RPR may be applied at the process level with greater efficiency than at the individual 
operation level. This is discussed further in the section on compound operations (matrix 
multiplication and FFT computation). 

Because the process contains several multiplication operations but only one voter 
operation at the end, the relative space and power savings 
of RPR over TMR increase. The quantitative benefit of using RPR depends on the degree of RPR, 
but will approach the ratios of RPR/TMR operation size found in Table 7, the comparison of 
multiplication operation sizes. 

b. Division by Reciprocation 

Division by reciprocation consists of finding the reciprocal 1/b of the 
divisor b, and then multiplying that reciprocal by the dividend a to obtain the quotient q. 
The reciprocal is commonly found using Newton-Raphson iteration [29], which begins 
with an initial estimate of some precision k and successively applies the formula 

$$ x^{(i+1)} = x^{(i)} \left( 2 - b\,x^{(i)} \right) \tag{16} $$

from [28]. The initial estimate $x^{(0)}$ is either stored as a constant or computed based on b, 
to precision n or r depending on whether the operation is precise or RPR. Finding the 
reciprocal requires the same basic operations as those that the repeated multiplication 
approach requires: finding 2’s complement and performing multiplication. The Newton- 
Raphson method converges quadratically (the same degree as the repeated-multiplication 
approach), and the computation error is determined in a fashion similar to that of the 
repeated-multiplication approach as well. The main differences between division by 
reciprocation and repeated multiplication are in the initial estimate and final 
multiplication (by the dividend) in the reciprocation method. The number of 
multiplications required in each approach is the same, although in the reciprocal method 
they are performed successively and in the repeated multiplication method they are 
performed independently on the numerator and denominator. The successive 
multiplication in the reciprocal method also increases the potential cumulative error of 
the process; however, performing the multiplication at double precision keeps the 
intermediate results accurate and alleviates this concern. Ultimately, division by 
reciprocation is another process that is more accurately computed using parallel block 
operations in precision n and r. 
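
The reciprocation method can be sketched in the same style. The iteration is the standard Newton-Raphson reciprocal update x <- x(2 - bx); the constant initial estimate of 3/2 and the normalization 1/2 <= b < 1 are assumptions chosen only so that the sketch converges, and the function names are hypothetical. Exact rationals are used, so the reciprocal (which exceeds 1) is not constrained to the fractional range.

```python
from fractions import Fraction


def reciprocal_newton_raphson(b: Fraction, iterations: int,
                              x0: Fraction = Fraction(3, 2)) -> Fraction:
    """Iterate x <- x * (2 - b*x); assumes 1/2 <= b < 1 so that x0 = 3/2 converges."""
    x = x0
    for _ in range(iterations):
        x = x * (2 - b * x)            # one 2's complement and two multiplications
    return x


def divide_by_reciprocation(a: Fraction, b: Fraction, iterations: int) -> Fraction:
    """Quotient a/b formed as a * (1/b), with 1/b from Newton-Raphson iteration."""
    return a * reciprocal_newton_raphson(b, iterations)


q = divide_by_reciprocation(Fraction(1, 4), Fraction(3, 4), iterations=4)
error = abs(q - Fraction(1, 3))
```

The quantity 1 - b*x is squared at every step, which is the quadratic convergence noted above; the final multiplication by the dividend is the extra step that distinguishes this method from repeated multiplication.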

3. Comparing RPR and TMR Implementations 

If the desired implementation approach for division is any method of convergence 
(as opposed to a shift-and-subtract method), then the number of operations required to 
complete the division can be estimated as a function of the precision required. For 
example, if the required precision of a quotient is n = 32, the number of iterations m of 
the repeated multiplication method is log2(32) = 5, as determined by Equation (15). 

This means that the number of multiplications is 2 x 5 - 1 = 9 (numerator and denominator 
in iterations 1-4 and numerator only in iteration 5), and the 2’s complement of the 
denominator must be found five times. Using the operation module size ratios in Table 7 
as a guide, it follows that for 8/32 RPR, the TMR implementation would be about 2.5 
times the size of the RPR implementation on an FPGA. Depending on the capacity of the 
FPGA and the desired computation speed, the division operation may be implemented 
using an architecture with fewer multiplication modules; the relative savings of RPR 
would be maintained regardless of how many multipliers are used. 

The appearance of several successive subtraction, multiplication or 2’s 
complement operations as part of a division operation highlights the question of whether 
to treat blocks of arithmetic operations as a single module to which one applies RPR. 
Scenarios so far have indicated that as RPR is applied to more complicated operations, its 
benefit increases (i.e., size compared to TMR decreases). However, the complicating 
factor of applying RPR to block operations is the determination of upper and lower 
bounds on the results of processes that involve multiple inputs and/or multiple iterations. 
The following section on compound operations addresses this and other questions. 

E. COMPOUND OPERATIONS 

A single adder or multiplier is rarely found in a configuration where it is 
completely isolated from all other arithmetic operations in a computer. In fact, 
computers were developed to execute many successive arithmetic operations - whether in 
a recursive or an iterative architecture. However, examining a many-part operation from 
the perspective of applying RPR generates questions: what are the upper and lower 
bounds for the result of the overall operation? Is there a performance benefit to be gained 
from testing intermediate results for errors? Can the chosen degree of RPR be 
maintained throughout the operation? Conversely, what is the impact of error (loss of 
precision) accumulated over the entire operation? 

The first question to address is whether upper and lower bounds can be computed 
directly for the results of operations more complicated than single instances of addition, 
subtraction, multiplication, or division. The most straightforward cases are when all 
operands are nonnegative. If an entire problem can be scaled such that it both operates 
on and produces only nonnegative numbers, then the case becomes simple: upper and 
lower bounds of the result may be obtained by performing the given operation on the 
upper and lower bounds of the operands, respectively. In unsigned fractional 
representation, any combination of addition, subtraction, multiplication and (constrained) 
division can be computed on precise operands to produce the precise result, on operands’ 
lower bounds to generate the lower bound of the result, and on operands’ upper bounds to 
generate the upper bound of the result. 
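
As an illustration of this rule for nonnegative operands, the sketch below evaluates a small multiply-accumulate expression three times: on the precise operands, on their r-bit lower bounds, and on their r-bit upper bounds. The helper names, the operand values, and the choice of a*b + c as the compound operation are assumptions for illustration only.

```python
from fractions import Fraction
from math import floor


def bounds(x: Fraction, r: int):
    """Lower and upper r-bit bounds of a nonnegative fraction x."""
    lower = Fraction(floor(x * 2 ** r), 2 ** r)   # truncate to r bits
    upper = lower + Fraction(1, 2 ** r)           # next representable r-bit value
    return lower, upper


def multiply_accumulate(a, b, c):
    """The compound operation under test: one multiplication, one addition."""
    return a * b + c


r = 4
a, b, c = Fraction(37, 256), Fraction(201, 256), Fraction(90, 256)
(a_lo, a_hi), (b_lo, b_hi), (c_lo, c_hi) = bounds(a, r), bounds(b, r), bounds(c, r)

precise = multiply_accumulate(a, b, c)
lower = multiply_accumulate(a_lo, b_lo, c_lo)     # same operation on lower bounds
upper = multiply_accumulate(a_hi, b_hi, c_hi)     # same operation on upper bounds
```

Here the three results are 104/256, 30477/65536 and 135/256, so the precise value lies between the two reduced-precision results as expected.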





Unfortunately, this rule does not hold for multiplication when one or more 
operands is negative. When negative numbers are introduced, addition and subtraction 
may proceed normally since the relation among those operands’ and results’ upper and 
lower bounds does not change with sign. However, when dealing with negative numbers 
in multiplication the upper and lower bounds of the operands must be rearranged as 
dictated in Equation (12) before being fed into reduced-precision multipliers. This 
rearranging ensures that the reduced-precision operations generate appropriate upper and 
lower bounds for the correct full-precision product. In a process that contains several 
successive multiplications, the signs of all operands must be tested before the operands 
are used so that their bounds are assigned to the appropriate reduced-precision operation 
inputs. In software implementations, this results in a series of statements conditioned on 
the signs of the original (full-precision) operands. In hardware implementations, the sign 
bits of the operands can be used to control a selector at the entry to each reduced- 
precision multiplier (upper and lower bound) that chooses the appropriate inputs for the 
multiplier. 
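
The series of sign-conditioned statements mentioned for software implementations might look like the following. Equation (12) is not reproduced in this section, so the selection rule here is the standard interval-arithmetic rule for operands of known sign and is only assumed to match it; the function names are hypothetical.

```python
from fractions import Fraction
from math import floor


def bounds(x: Fraction, r: int):
    """r-bit lower and upper bounds of x (truncation toward minus infinity)."""
    lower = Fraction(floor(x * 2 ** r), 2 ** r)
    return lower, lower + Fraction(1, 2 ** r)


def product_bounds(a: Fraction, b: Fraction, r: int):
    """Bounds on a*b from r-bit operand bounds, selected by the operand signs."""
    a_lo, a_hi = bounds(a, r)
    b_lo, b_hi = bounds(b, r)
    a_neg, b_neg = a < 0, b < 0        # sign bits of the full-precision operands
    if not a_neg and not b_neg:
        return a_lo * b_lo, a_hi * b_hi
    if a_neg and b_neg:
        return a_hi * b_hi, a_lo * b_lo
    if a_neg:                          # a negative, b nonnegative
        return a_lo * b_hi, a_hi * b_lo
    return a_hi * b_lo, a_lo * b_hi    # a nonnegative, b negative


a, b = Fraction(-37, 256), Fraction(201, 256)
low, high = product_bounds(a, b, r=4)
```

In hardware the two sign bits would drive the selectors in front of the reduced-precision multipliers; the four branches above correspond to the four selector settings.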

The second question to answer when considering compound operations for RPR 
is whether there is benefit in testing intermediate precise and bound results, or if only the 
final result should be tested. As shown previously, adding a voter to test an RPR addition 
result generally doubles the number of FPGA slices required to execute the addition 
operation. Multiplication is a larger operation and therefore the relative significance of 
its RPR voter size is smaller, but the absolute RPR voter size in multiplication is even 
larger than in addition. Any benefit of testing intermediate results in RPR processes must 
be considered against the additional space it requires. 

The most significant benefit of testing intermediate RPR results is in locating the 
source of an error more quickly, thereby enabling faster recovery to error-free status. 
Particularly in parallel-processing designs, where many arithmetic modules are used to 
reduce computation time, it may be both desirable and possible to reconfigure part of the 
FPGA that has been affected by an SEU while processing continues on the unaffected 
partitions of the hardware. However, in order (1) to confirm that an incorrect result was 
caused by a configuration error (as opposed to a data error) and (2) to identify which 
partition is affected in the case of a configuration error, it is necessary to locate the origin 
of the computation error. 

If a compound operation is executed using modules in multiple partitions of an 
FPGA, testing consecutive intermediate results is the only indirect method of narrowing 
the range of possible error locations. The primary method of discovering, locating and 
correcting configuration errors on an FPGA is by comparing the FPGA operational 
configuration directly to a stored (protected) copy of the configuration at regular time 
intervals, and reprogramming the FPGA with the stored copy of the configuration when a 
discrepancy is found. If intermediate result checking can be used as a passive way to 
locate whether, when and where configuration errors occur in a circuit, it may enable 
more efficient operation of the FPGA. RPR voters may be inserted at the output of any 
single operation where a precise result and two bound results are generated, and 
additional signals may be set to trigger reconfiguration if two consecutive errors are 
encountered. However, the quantitative benefit (e.g., computation speedup) of detecting 
and locating errors in this manner needs to be explored through further study that focuses 
specifically on that issue. 

The third question generated while designing RPR protection for compound 
operations is how to quantify the error - or loss of precision - accrued over multiple 
calculations. This is particularly important in RPR-protected systems because the degree 
of RPR represents the greatest acceptable loss of precision; any further reduction in 
precision due to accumulated error may cause a significant decline in system 
performance. The amount of error accumulated in compound operations depends 
primarily on two things: the number of addition operations in the set, and the internal 
precision of the operation - i.e., the number of guard bits in each calculation. Examples 
of the error accumulated in two types of compound operations are shown in the following 
subsections on matrix multiplication and FFT computation. 

1. Matrix Multiplication 


Matrix multiplication consists of computing the inner product of two vectors for 
each matrix element. The inner product is defined as 

$$ p = \sum_{i=1}^{N} x_i y_i \tag{17} $$

where $x_i$ and $y_i$ are corresponding elements of the two vectors (or row and column 
elements of two matricial operands) and have precision n. Assuming each addition adds 
a maximum possible error $\varepsilon_{max} = 2^{-n}$, the maximum aggregate error in each element of a 
product matrix of size N is $(\varepsilon_{max})_{total} = N \cdot 2^{-n}$, or N times the smallest 
representable number of precision n. This maximum total error reduces the precision of 
the solution by $m = \log_2(N)$ bits. Therefore, in order to maintain n significant bits of 
precision in the solution to an inner product or matrix multiplication of size N, at least m 
guard bits - extra bits of precision - must be carried through the operation. The final 
result is then rounded to n bits as with simple addition or multiplication. Executing the 
combination multiplication-addition operation at double precision (2n), as described 
previously for RPR multiplication, provides enough guard bits to ensure that the only 
significant error accumulated is the $\varepsilon < 2^{-n}$ generated in the rounding of the final result. 
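
The effect of guard bits can be checked numerically. In the sketch below each accumulated term is truncated to a fixed number of bits, which models an error of less than one LSB per term; this error model, the operand values, and the helper names are assumptions made for illustration.

```python
from fractions import Fraction
from math import ceil, floor, log2


def guard_bits(N: int) -> int:
    """Extra bits needed so that N accumulated errors stay below 2^-n."""
    return ceil(log2(N))


def truncate(x: Fraction, bits: int) -> Fraction:
    return Fraction(floor(x * 2 ** bits), 2 ** bits)


def inner_product(xs, ys, bits: int) -> Fraction:
    """Accumulate products, truncating each term to the given number of bits."""
    return sum((truncate(x * y, bits) for x, y in zip(xs, ys)), Fraction(0))


n, N = 8, 8
xs = [Fraction(k, 37) for k in range(1, N + 1)]
ys = [Fraction(k, 41) for k in range(N, 0, -1)]
exact = sum(x * y for x, y in zip(xs, ys))

err_plain = exact - inner_product(xs, ys, n)                    # no guard bits
err_guarded = exact - inner_product(xs, ys, n + guard_bits(N))  # m = 3 guard bits
```

Without guard bits the error is bounded by N * 2^-n; carrying m = log2(N) = 3 guard bits brings the bound back down to 2^-n, in line with the argument above.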

a. Determining Bounds for Matrix Multiplication 

Determining proper bounds on the result of each inner product 
depends on the signs of the input vector (or matrix) elements $x_i$ and $y_i$. The 
arrangement of bounds for each multiplication part of the operation is described in 
Equation (12). The addition parts of the operation require no special treatment - 
provided the addends are already in two’s complement form, or converted to that form, 
before performing the addition. A complete matrix multiplication process developed 
using MATLAB 2007 Release A is included in Appendix B. The treatment in Appendix 
B multiplies numbers of arbitrary sign and random fractional values. It concludes by 
comparing upper and lower bound results to the precise result and showing that the 
algorithm for choosing RPR operation inputs generates the correct upper and lower 
bounds on the inner product result. 


b. Comparing RPR and TMR Implementations 

Finding the inner product of two vectors of length l requires l 
multiplications and l - 1 additions. (Alternatively there may be l additions if the first 
product is added to a zero register, as in an accumulator.) Matrix multiplication is the 
computation of m inner products, where m is the number of elements in the product 
matrix. If the implementation design contains one multiplier and one adder, meant to be 
used many times, the space savings of RPR over TMR is on the order of the space 
savings of one multiplication operation, since multiplication is a much larger operation 
than addition. This is shown in Table 8. 


Redundancy   Precision        Approximate Slice Count Projection
Type         (or Degree)      (Implementation: multipliers/adders/voters)
                              1/1/1     1/1/2     l/l/1           l/l/l

TMR          64               6542      6599      6477l + 65      6542l
RPR          32/64 (0.5)      3390      3610      3275l + 115     3390l
RPR          16/64 (0.25)     2546      2672      2451l + 95      2546l
RPR          8/64 (0.125)     2295      2385      2237l + 58      2295l
TMR          32               1706      1195      1673l + 33      1706l
RPR          16/32 (0.5)      929       1040      851l + 78       929l
RPR          8/32 (0.25)      679       752       637l + 42       679l
TMR          16               457       470       440l + 17       457l
RPR          8/16 (0.5)       258       326       224l + 34       258l


Table 8. Projected FPGA Area Required for Matrix Multiplication. 


Depending on the amount of error checking desired, matrix multiplication 
may be implemented with a voter after each calculation (one multiplier, one adder, two 
voters) or after the compound operation (one multiplier, one adder, one voter). Bounds 
of the result in either case (one operation or two in sequence) are easily determined. If 
the operation is implemented with l multipliers and one or more adders, the space savings 
of applying RPR over TMR increases with l. As discussed previously, the number of 
intermediate results checked for errors - i.e., the number of voters - is a design decision. 
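
The projections in Table 8 that grow with the number of multipliers can be evaluated for a particular design. The sketch below uses the per-unit and voter slice counts that appear in the table's linear expressions (for example, 6477 and 65 for 64-bit TMR); the function names and the choice of four multiplier/adder pairs are illustrative assumptions.

```python
def area_one_voter(unit: int, voter: int, l: int) -> int:
    """l multiplier/adder pairs followed by a single voter."""
    return unit * l + voter


def area_voter_each(unit: int, voter: int, l: int) -> int:
    """l multiplier/adder pairs, each with its own voter."""
    return (unit + voter) * l


# (slices per multiplier/adder pair, slices per voter), read from Table 8
tmr_64 = (6477, 65)
rpr_8_64 = (2237, 58)

l = 4
tmr_area = area_one_voter(*tmr_64, l)
rpr_area = area_one_voter(*rpr_8_64, l)
ratio = rpr_area / tmr_area
```

For four pairs and one voter this gives 25,973 slices for TMR against 9,006 for 8/64 RPR, a ratio of about 0.35; the fixed voter cost matters less as l grows.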

2. Discrete Fourier Transform and Fast Fourier Transform 


The DFT is expressed as 

$$ X_k = \sum_{j=1}^{N} x_j \, W_N^{(j-1)(k-1)} \tag{18} $$

where X is the output sequence of frequency terms corresponding to the set x of time- 
domain input samples. Implementing a DFT generally includes generating a lookup table 
of twiddle factors and then completing successive multiplication and addition or 
subtraction operations to obtain the output frequency-domain terms from the input time- 
domain terms. In MATLAB the DFT is calculated using the Cooley-Tukey FFT 
algorithm, which reduces the series calculation to repetitive butterfly operations with 
different twiddle factors, as described in Chapter II. In this research, RPR bound 
determination was considered for both the DFT block operation and the FFT butterfly 
operation. 


a. Determining Bounds for a DFT 

To find each output point of the N-point DFT shown in Equation (18), the 
tasks are: retrieve a multiplication factor from a lookup table, multiply an input value by 
the factor from the table, and add the result to the accumulating sum. In practice, twice 
as many operations are performed as are represented symbolically because the factors 
$W_N$ are complex; this requires that the real and imaginary parts of each result be calculated 
separately. As in matrix multiplication, the arrangement of upper and lower bounds at 
the input to each multiply operation depends on the sign of the operands. In this case, the 
three possible operands are the jth input sample, the real part of the associated twiddle 
factor $W_N$, and the imaginary part of $W_N$. A complete DFT process generated using 
MATLAB 2007 Release A that takes a sequence of arbitrary length, determines input 
upper and lower bounds, performs the transforms and checks the validity of the precise, 
upper and lower bound output values is included in Appendix B. 
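
The three tasks listed above (table lookup, multiplication, accumulation) can be sketched for real-valued input samples as follows. This is not the Appendix B process: it omits the bound computations, uses zero-based indices, and assumes the common twiddle-factor convention W_N = exp(-2*pi*i/N); the function name `dft` is illustrative.

```python
import cmath


def dft(x):
    """Direct N-point DFT of real samples using a precomputed twiddle-factor table."""
    N = len(x)
    table = [cmath.exp(-2j * cmath.pi * k / N) for k in range(N)]  # W_N^k
    out = []
    for k in range(N):
        re = im = 0.0
        for j, sample in enumerate(x):
            w = table[(j * k) % N]     # retrieve the factor from the lookup table
            re += sample * w.real      # real part accumulates separately
            im += sample * w.imag      # imaginary part accumulates separately
        out.append(complex(re, im))
    return out


impulse = dft([1.0, 0.0, 0.0, 0.0])
constant = dft([1.0, 1.0, 1.0, 1.0])
```

Each input sample is multiplied by both the real and the imaginary part of the factor, which is the doubling of operations noted in the text; in an RPR version each of those two multiplications would need its own sign-dependent bound arrangement.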

b. Determining Bounds for an FFT Butterfly Operator 


The butterfly operator is the smallest repeating unit of operation in an 
FFT. The structure for a radix-2 butterfly, shown first in Chapter II, is repeated here for 
convenience (Figure 30). The twiddle factors $W_N$ are obtained from a lookup table as 
with the full DFT. 



Figure 30. Radix-Two FFT Butterfly Operation (from Figure 8). 

The butterfly operation comprises one multiplication, one addition and one 
subtraction. The subtraction may be executed by adding the two’s complement of the 
multiplication result $W_N b$ to the addend a. The real and imaginary parts of the results 
$a + W_N b$ and $a - W_N b$ must be computed separately from the real and imaginary parts of 
the three inputs a, b and $W_N$. The bounds on the results are calculated from the bounds of 
the inputs, arranged depending on the signs of each input as stated in Equation (12) for 
multiplication and Equation (7) for addition/subtraction. The factor $W_N$ must also be 
rounded for the reduced-precision calculations; either the rounded value may be used as 
both “upper” and “lower” bounds, or upper and lower bounds on $W_N$ may be computed 
and used in the product bound computations. If the single rounded value is used, the 
resulting reduced-precision products will still be upper and lower bounds on the precise 
product as long as the upper and lower bounds of input b are used correctly. Equation 
(19) is a set of conditional equations that demonstrates the relationship between the signs 
of the real and imaginary parts of the operands b and $W_N$ and the upper and lower bounds 
of the product $W_N b$. For the sign bits $s_x$, ‘0’ implies positive or zero and ‘1’ implies 
negative. 
After computing the bounds on the product $W_N b$, the results are used to 
compute the upper and lower bounds on the operations $a + W_N b$ and $a - W_N b$ (or the 
two’s complement operation, mathematically $a + (2 - W_N b)$) as determined by Equation 
(7). Because the operands are complex, the two operations are actually four arithmetic 
operations (adding real and imaginary parts separately for each addition). The bounds for 
each addition are calculated in the same way. 

c. Comparing RPR and TMR Implementations 

A fast Fourier transform butterfly operation contains one complex 
multiplication and two complex additions. Separating the real and imaginary parts of a, 
b, and w for computation yields four multiplications, three additions and three 
subtractions (or four multiplications, six additions and two 2’s complement operations) as 
shown in Equation (20). 
$$
\begin{aligned}
(w_r + w_i i)(b_r + b_i i) &= (w_r b_r - w_i b_i) + (w_r b_i + w_i b_r)\,i && \rightarrow \text{4 mult, 1 add, 1 sub} \\
(a_r + a_i i) + ((wb)_r + (wb)_i i) &= (a_r + (wb)_r) + (a_i + (wb)_i)\,i && \rightarrow \text{2 add} \\
(a_r + a_i i) - ((wb)_r + (wb)_i i) &= (a_r - (wb)_r) + (a_i - (wb)_i)\,i && \rightarrow \text{2 sub (complement/add)}
\end{aligned} \tag{20}
$$

Depending on the amount of error checking required, a butterfly operation 
may include one or more voters on the final or intermediate results. Since the butterfly 
operator is the basic building block that is copied many times to build an FFT calculator, 
one obvious approach is to treat the butterfly as a block operation, and simply append two 
voters (one each for the real and imaginary parts of the result) to each RPR butterfly 
machine. However, it is possible to include a pair of voters after the complex products 
$(W_N b)_{r,i}$ as well, or even after each individual multiplication $w_x b_y$. An option at the 
other extreme is to check only the final result for each FFT output point, after the data 
have passed through all levels of butterflies. Based on the area requirements for 
multiplication and addition determined in Table 4 and Table 6, a comparison of projected 
requirements for a few butterfly operation designs is shown in Table 9. 


Redundancy   Precision        Approximate Slice Count Projection
Type         (or Degree)      (Implementation: multipliers/adders/voters)
                              4/6/2      4/6/4      4/6/8      4/6/0

TMR          64               26,236     26,366     26,622     26,106
RPR          32/64 (0.5)      13,464     13,694     14,598     13,234
RPR          16/64 (0.25)     10,096     10,286     10,806     9,906
RPR          8/64 (0.125)     9,150      9,266      9,670      9,034
TMR          32               6,860      6,926      7,054      6,794
RPR          16/32 (0.5)      3,630      3,786      4,242      3,474
RPR          8/32 (0.25)      2,686      2,770      3,110      2,602
TMR          16               1,848      1,882      1,946      1,814
RPR          8/16 (0.5)       1,002      1,070      1,378      934


Table 9. Projected FPGA Area Required for Radix-Two FFT Butterfly Operation. 


Essentially, the space savings for RPR as compared to TMR in an FFT 
butterfly operation are on the order of 50% for high degrees of RPR (32/64 or 16/32). To 
gain greater advantage in size, choosing a smaller degree of RPR (such as 8/32 or even 
8/64) gives a space savings approaching 66% (that is, the RPR operation takes only 1/3 
the space of the TMR operation). This is consistent with the findings of Snodgrass in [3]. 

The presence of many voters in the design does not significantly change the area
requirement in cases of high precision in the main operation (i.e., n = 32 or greater). 
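The savings quoted above can be reproduced from the projected slice counts. The sketch below is a hedged illustration that uses only the 4/6/2 column of Table 9; the dictionary and function names are hypothetical.

```python
# Projected slice counts from Table 9, 4/6/2 implementation column
TMR_SLICES = {64: 26236, 32: 6860, 16: 1848}
RPR_SLICES = {
    (32, 64): 13464, (16, 64): 10096, (8, 64): 9150,
    (16, 32): 3630, (8, 32): 2686, (8, 16): 1002,
}

def savings(r, n):
    """Fraction of the n-bit TMR area saved by r/n RPR."""
    return 1.0 - RPR_SLICES[(r, n)] / TMR_SLICES[n]

if __name__ == "__main__":
    for (r, n) in sorted(RPR_SLICES, key=lambda k: (-k[1], -k[0])):
        print(f"RPR {r}/{n}: {savings(r, n):.1%} smaller than TMR")
```

With these figures the 32/64 and 16/32 cases land just under one half, while 8/64 lands near 65%, which matches the "approaching 66%" statement in the text.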

F. FURTHER DISCUSSION OF ERRORS AND THE RPR VOTER 

1. Additional Considerations for RPR Voters

One of the results of the experiments conducted in this chapter was a table of the 
area required by the voters/error checkers for TMR and various degrees of RPR for 
multiplication and addition. For the purpose of comparing across operations, the voter 
area requirements are shown in Table 10, without the requirements for the corresponding 
operation modules. 


| Redundancy Type | Precision (or Degree) | Addition Voter (slices) | Multiplication Voter (slices) |
|---|---|---|---|
| TMR | 64 | 65 | 64 |
| RPR | 32/64 (0.5) | 115 | 226 |
| RPR | 16/64 (0.25) | 95 | 130 |
| RPR | 8/64 (0.125) | 58 | 101 |
| TMR | 32 | 33 | 32 |
| RPR | 16/32 (0.5) | 78 | 114 |
| RPR | 8/32 (0.25) | 42 | 85 |
| TMR | 16 | 17 | 16 |
| RPR | 8/16 (0.5) | 34 | 77 |

Table 10. Area Required By TMR and RPR Multiplication Experiments.


The voters constructed for the multiplication operations in this research compared 
the full intermediate product of precision 2r with the value in the first 2r places of the 
precise result. This required the comparators in the RPR multiplication voters to be twice 
the size of the comparators in the RPR addition voters, which in turn made the 
multiplication voters approximately two times the size of the addition voters. Much of
this increase in size can be avoided by comparing only the first r bits of the products, as
discussed in the multiplication section.

Another requirement for RPR implementation that is not represented in these
results is the logic to test the signs of the operands and choose the correct input for the
upper and lower bound calculations. This applies to multiplication and, if a separate
subtractor is desired (instead of adding the 2’s complement of the subtrahend), to
subtraction. The selectors implemented to choose the final result (precise or RPR) in
each RPR voter occupy as many slices of FPGA logic as there are bits in the output
product; it is therefore likely that the area required by each input selector is as many
slices as the precision of the input value (e.g., 8 for 8/64 RPR). In a small degree of RPR,
this number is trivial compared to the total size of the operation (over 2000 slices for r/64
multiplication).

2. Error Detection with RPR 

An SEU causes an error in an FPGA if and only if the fault it produces - by 
changing a logic value in configuration or data memory - directly or indirectly changes 
one or more output values. When a fault occurs in an RPR-protected addition or 
subtraction operation, there are eight possible scenarios: the error-free case and seven
different types of errors. Figure 31 shows the eight scenarios as first defined by
Snodgrass [3], emphasizing the differences among them.

Figure 31. Possible Error Scenarios in RPR (After [3]).

Of the seven types of errors depicted in Figure 31, only four (the four cases where 
any two lines cross) are detectable using an RPR voter with the comparison methods 
described in this chapter. Of the remaining three cases, two are trivial; when there is an 
error in either bound result that maintains the correct relative magnitude of the upper, 
exact and lower results, the (correct) exact result will be used and no error will be 
propagated. The third undetected error, which occurs in the exact result but is small 
enough not to disturb the upper/exact/lower relative magnitudes, is by definition still 
within the tolerance described by the degree of RPR in the system. 
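The ordering test described above can be sketched in software. The function below is a minimal illustration under stated assumptions, not the hardware voter built in this research: the name `rpr_vote`, the three boolean flags, and the fallback policy (trust the exact result when a bound is the outlier, otherwise fall back to a bound) are hypothetical choices made for this example under a single-fault assumption.

```python
def rpr_vote(exact, lower, upper):
    """Select an output from one exact and two reduced-precision bound results.

    Returns (value, errors), where errors is a tuple of three booleans:
    (lower > exact, exact > upper, lower > upper).
    """
    e_low = lower > exact      # exact result fell below the lower bound
    e_high = exact > upper     # exact result rose above the upper bound
    e_bounds = lower > upper   # the two bounds crossed each other
    errors = (e_low, e_high, e_bounds)

    if not any(errors):
        return exact, errors   # ordering intact: use the precise result
    if e_bounds and not (e_low and e_high):
        # One bound disagrees with both other values, so that bound is
        # suspect and the exact result is still trusted (assumed policy).
        return exact, errors
    # Otherwise the exact result is the outlier: fall back to a bound.
    return (upper if e_high else lower), errors
```

An error that leaves the ordering lower ≤ exact ≤ upper intact raises no flag, which corresponds to the three undetected cases discussed above.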

In order to determine whether an error occurred in data or configuration memory, 
it is necessary either to check the FPGA configuration directly (if no discrepancies are 
found, the error was in data), or to keep track of errors in the operation output for more 
than one cycle. In general, an SEU occurring in data memory will only affect a single 
final output, although the error will appear in all intermediate results involving that 
particular value after the error occurred. In contrast, an SEU occurring in FPGA
configuration memory may or may not cause an error in the operation output. There are
usually “extra” bits in FPGA configuration memory whose state is irrelevant to the
proper operation of the circuit in most or all cases. Most of these are unused logic: if an
SEU affects an unused FPGA slice, no error occurs. However, if a fault occurs in a slice
of programmed logic and does cause an error, every value that flows through the faulty
operation is affected. These error classes and the effects of the different errors are
reiterated and discussed further in Chapter IV.

Whether the method of redundancy chosen is TMR or RPR, the designer of a
system must decide whether checking output for successive errors is the preferred method
of detecting configuration errors. If it is, the output checks may be implemented in an
RPR system by using the three “comparison error” output signals set by any RPR voter.
These three signals may control or trigger a scrub and reconfiguration of all or part of the
FPGA. The additional logic needed to execute the conditional scrub must be traded
against the time required to do configuration scrubs at regular intervals without being
called by an error signal.

Regardless of the time required to detect and correct FPGA configuration errors in
a system, the minimization of the errors’ impact on system performance is the true
measure of success in fault tolerance. Chapter IV investigates some of the effects of
using RPR on the performance of satellite control and sensor systems.

IV. EVALUATING RPR PERFORMANCE


The RPR operations in Chapter III demonstrate that sometimes a degree of RPR 
of 0.125 or smaller is needed to achieve significant FPGA area and power savings. This 
is a notable reduction in precision. To illustrate, consider that the smallest representable
value in signed 64-bit fixed-point format is on the order of $1 \times 10^{-19}$. The smallest
representable value in signed 8-bit fixed-point format (which is the precision of the
bounds in 8/64 RPR) is 0.0039, or $4 \times 10^{-3}$. Because the reduction in precision may be
great, it is important to explore its impact on the performance of a system protected using 
RPR. This can be done using numerical simulations, with the occurrence of errors 
(triggering subsequent use of RPR results) modeled as an increase in system noise. 

A. MODELING ERRORS DUE TO SINGLE EVENT UPSETS 

1. Classes of Errors in FPGAs

It is important to remember that there are two distinct classes of faults that may 
cause errors in an FPGA-based system. A data fault occurs when a charged particle 
changes the value of one or more bits in data memory. This causes a transient error, so
named because it only lasts as long as that particular piece of data is in the system. In a 
process like the FFT, where data continually flows through the system in a single 
direction, a transient error affects one or more output values depending on how early in 
the computation process it occurs. In a recursive process like an ADCS, a data error may 
have greater impact because some data is “remembered” - for example, the system state 
vector is maintained in memory and updated with every execution of the control loop. 
However, the state vector is also updated every cycle with fresh input data from the 
system attitude sensors. This could overwrite any errors in the stored system state vector 
depending on the ADCS architecture. 

The second class of fault that can occur in an FPGA is in the configuration of the 
FPGA. A configuration fault occurs when a charged particle changes the value of one or 
more bits in the memory that holds the FPGA configuration. In the Xilinx® Virtex™ 
XCV600 FPGAs used on CFTP, there are 6,127,744 configuration bits and only 98,304
total block SelectRAM bits [25], so it is much more likely that a randomly incident 
charged particle will affect a configuration bit than a data bit. A configuration memory 
fault has the potential to inflict much more damage on a reconfigurable system than a 
transient fault could cause, because a configuration fault has the potential to generate an 
error in every value that it touches - it is persistent. A configuration fault may also create 
situations that are not even possible within the mathematical or logical rules governing 
the system operation; since it changes a random bit in a random lookup table (LUT) or 
other functional element of the FPGA, it may eliminate a potential output state, change 
the role of an input/output (I/O) buffer, or create other hazardous conditions outside the
“logical” possibilities. 

Both classes of faults manifest themselves as errors in the output values of a 
reconfigurable computer. However, the propagation and extent of the errors caused by 
data and configuration faults are different. When inspecting the instantaneous output of a 
system, redundant results (TMR or RPR) may be compared to detect and correct an SEU- 
related error. However, whether the SEU caused a data error or a configuration error is 
not evident until two or more successive sets of output are examined. A transient error 
due to a data fault may propagate through multiple steps in a processor, but within a finite 
number of clock cycles the error becomes obsolete and no longer affects the computation 
results. A persistent error due to a configuration fault is not corrected until part or all of 
the FPGA is reconfigured. Detecting the presence of either fault may be done by
flagging errors in the system output; distinguishing between errors caused by a 
configuration fault and those caused by a data fault is much more difficult, but may be 
accomplished by integrating output errors over a number of clock cycles. 
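One way to picture "integrating output errors over a number of clock cycles" is a sliding-window counter on the voter error flag. The class below is a sketch only; the class name, the window length and the threshold are hypothetical tuning choices, not values taken from this research.

```python
from collections import deque

class ErrorIntegrator:
    """Counts voter error flags over a sliding window of clock cycles."""

    def __init__(self, window=16, threshold=4):
        # window and threshold are assumed values chosen for illustration
        self.history = deque(maxlen=window)
        self.threshold = threshold

    def update(self, error_flag):
        """Record one cycle; return 'none', 'transient' or 'persistent'."""
        self.history.append(bool(error_flag))
        count = sum(self.history)
        if count == 0:
            return "none"
        return "persistent" if count >= self.threshold else "transient"
```

An isolated flag is reported as transient and ages out of the window, whereas repeated flags within the window are reported as persistent and could be used to request a configuration scrub.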

2. Modeling Errors as Noise 

Consider a signal that is digitally sampled at regular intervals with precision
n = 32. Ideally the smallest representable energy level in the sampling regime, assuming
fixed-point fractional representation, is $2^{-32} = 2.328 \times 10^{-10}$ volts. If the signal processor
has no fault tolerance and a data or configuration error occurs, a given output could be
incorrect in any or all digits; the magnitude of the error ranges from $2^{-32}$ to $2^{-1}$, with
maximum error magnitude $|e| \le 2^{-1} = (0.1)_2 = (0.5)_{10}$.

If the same signal is sampled with reduced precision r = 8, the smallest
representable energy level is $2^{-8} = 3.91 \times 10^{-3}$ volts. If a signal processor has RPR fault
tolerance and an error occurs, the RPR result will be used. The RPR result is guaranteed
to be correct to the rth bit, i.e., maximum error is $|e| \le 1 \times 2^{-9} = (0.001953)_{10}$
volts. The r-bit RPR result sacrifices n - r bits of precision when it must be used instead
of a correct full-precision result. However, the RPR result preserves r bits of precision
when the alternative is no fault tolerance.

The difference in magnitude of the error in a system with no fault tolerance and
the error in a system with RPR fault tolerance can be expressed using the signal-to-noise
ratio (SNR) concept. When there is no fault tolerance in a system, the “noise” that
corrupts a result due to an error has potentially the same magnitude and power as the
signal itself. The maximum possible error $|e| = (0.5)_{10}$ translates to $SNR = 3$ dB for
worst-case error scenarios. When an error is detected in an RPR system and the RPR
result is used, the loss of precision in the RPR result is analogous to noise at a lower
power. Low-power noise essentially randomizes values in the LSB of the result, making
fine resolution of the signal impossible; however, it does not affect coarse signal
resolution, which is found by interpreting the MSB of the result. In the example system
(8/32 RPR), the relative noise power represented by the RPR result gives
$SNR_{RPR} = 10 \log_{10}(1/0.001953) = 27.1$ dB. This procedure can be applied to any degree
of RPR to determine the equivalent noise introduced by using the RPR result.
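The procedure generalizes to any r as a one-line formula. The sketch below assumes, as in the worked example, that the worst-case error of an r-bit RPR result is $2^{-(r+1)}$ and that r = 0 stands for no fault tolerance; the function name is hypothetical.

```python
import math

def rpr_equivalent_snr_db(r):
    """Equivalent SNR (dB) when an r-bit RPR result replaces the precise one.

    r = 0 models no fault tolerance (maximum error 2**-1).
    """
    max_error = 2.0 ** -(r + 1)
    return 10.0 * math.log10(1.0 / max_error)

if __name__ == "__main__":
    for r in (0, 8, 16, 24, 32):
        print(r, round(rpr_equivalent_snr_db(r), 2))
```

Each additional bit of reduced precision adds about 3.01 dB, so r = 8, 16, 24 and 32 correspond to roughly 27, 51, 75 and 99 dB.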

3. Experiment Details

The experiments in this chapter were conducted using MATLAB Release 2007a 
with Simulink version 6.6. Machine epsilon ε for the system configuration used was
$2.2204 \times 10^{-16}$, approximately equivalent to n = 52 in fixed-point representation². The
degrees of RPR simulated in this experiment therefore follow the form r/52. Errors were
simulated using either a random number generator with output scaling (to represent an
uncorrected error or an RPR result), or an additive white Gaussian noise (AWGN)
channel simulator with SNR corresponding to the desired reduction in precision. In the
AWGN channel simulator, the representative input signal power was calculated using
Xilinx Virtex™ XQVR600 FPGA power information [30] as a guide; this experiment
was based on signal strength of 2.5 V at 10 mA for a total of 25 mW simulated input
signal power.
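The two numeric claims in this paragraph are easy to confirm. The snippet below uses Python rather than MATLAB, on the assumption that both use IEEE 754 double precision; the variable names are illustrative.

```python
import math
import sys

eps = sys.float_info.epsilon        # double-precision machine epsilon
equivalent_bits = -math.log2(eps)   # fractional bits with the same resolution
signal_power_w = 2.5 * 0.010        # 2.5 V at 10 mA, in watts

if __name__ == "__main__":
    print(eps, equivalent_bits, signal_power_w)
```

Machine epsilon equals $2^{-52}$, which is why the simulated degrees of RPR take the form r/52.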

B. EVALUATING PERFORMANCE IN SPACECRAFT ADCS

The example attitude control system chosen for RPR performance evaluation was 
the NPS Bifocal Relay Mirror Satellite (BRMS) Simulator (BRMSS) model, developed 
in 2005 by Kim [31] and described in detail in Chapter V. Figure 32 shows the block 
diagram of the complete system model. 


Figure 32. BRMS Simulator System Model (From [31]).


The shaded blocks in Figure 32 represent key subsystems of the ADCS model. 
From left to right, they are (1) the gravity torque model, (2) the Euler dynamics model, 
(3, top) the quaternion kinematics conversion, (4, bottom) the attitude and rate command
processing, (5) the quaternion feedback controller, and (6) the control moment gyro
(CMG) steering law. When an experiment is run with the BRMSS hardware, the physics
of the hardware replaces blocks (1) through (3) and the operator commands a trajectory
through block (4). The heart of the ADCS is in blocks (5) and (6): the controller and the
control allocation (CMG steering law).

2 Although MATLAB actually uses double-precision floating-point representation, these experiments
are meant to simulate performance impact by examining numeric results at a level far above machine
implementation. Therefore it is not a concern that the operations are not performed using the fixed-point
designs specified in the previous chapter.

One of the most sensitive points in a satellite ADCS is allocation of the control
command to the actuators - in this case, CMGs. CMGs are heavy, delicate machinery
that apply very high torque to a spacecraft relative to the power they require to operate.
Commanding a set of CMGs in a manner unsuitable for their operation can cause both
temporary and permanent damage to the actuators or their housing, as well as loss of
control of the spacecraft. For this reason, the worst-case scenario chosen for injecting
error into the BRMSS ADCS model was between the controller and the CMG steering
law: the control command.

The control command in the BRMSS model is a three-element vector generated
by the controller that represents torque in the roll, pitch and yaw directions. The control
command is converted by the CMG steering law into power levels supplied to each
actuator that sum to produce the desired total torque. If any single component of the
control command is in error, that component of the correction torque generated by the
actuators will force the spacecraft to point or rotate away from the desired state. If the
error is large, it is possible to drive the system past the limit of controllable error. At this
point, control of the spacecraft is lost and cannot be recovered without contingency
operations.

In this experiment, the three components of the control command were passed
through independent additive white Gaussian noise (AWGN) channels. Activation, level
and timing of the noise on each channel were controlled by variables set in a MATLAB
script and called by the functions in the model. Since the experiment was meant to
simulate a single event effect, only one noise channel was activated for any given trial. A
diagram of the error injection system is shown in Figure 33.
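The injection scheme can be sketched outside Simulink as follows. This is an assumption-laden illustration rather than the MATLAB script used in the experiment: the function names, the parameter names, and the convention that noise variance equals signal power divided by the linear SNR are all hypothetical choices for this example.

```python
import math
import random

def awgn(value, signal_power, snr_db, rng=random):
    """Add zero-mean Gaussian noise whose power gives the requested SNR."""
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    return value + rng.gauss(0.0, math.sqrt(noise_power))

def inject(command, t, axis, snr_db, t_start, persistent, dt,
           signal_power=0.025, rng=random):
    """Corrupt one element (axis 0, 1 or 2) of a 3-element torque command.

    A transient error is active only in the sample at t_start; a persistent
    error is active in every sample from t_start onward.
    """
    if persistent:
        active = t >= t_start
    else:
        active = abs(t - t_start) < dt / 2.0
    out = list(command)
    if active:
        out[axis] = awgn(out[axis], signal_power, snr_db, rng)
    return out
```

Only the selected axis is ever modified, which mirrors the single-channel activation used for each trial.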

Figure 33. BRMSS Model Control and CMG Allocation with Error Injection.


The scenario used for the RPR performance evaluation was a standard “reference
maneuver,” where the spacecraft began at rest in orientation (0°, 0°, 0°) and
was commanded to move to orientation (1°, 1°, 1°) and stop. The maneuver with no
fault-induced errors is executed in less than twenty seconds, which includes the time required
for the system to settle to within two percent of the target orientation (Figure 34). The
control command with no error over the course of the maneuver is shown in Figure 35.
In Figure 34 and Figure 35, the roll (X) and pitch (Y) trajectories are coincident; only the
yaw (Z) trajectory is different. This is due to the moment of inertia (MOI) of the
simulated spacecraft, which is larger about the Z axis than about the X or Y axes.

Figure 34. ADCS Reference Maneuver with No Error - Angular Position.

Figure 35. ADCS Reference Maneuver with No Error - Commanded Control.

The error scenarios tested with the ADCS model included both transient (discrete
delta function δ) and persistent (step function) error models. Both types of errors
occurred at five seconds into the simulation (t = 5). For each type of error, the reference
maneuver was executed with SNR levels equivalent to no fault tolerance (SNR = 3 dB)
and RPR fault tolerance for r = 8, 16, 24, and 32 (SNR = 27, 51, 75, and 99 dB,
respectively). The third independent variable in the tests was the element of the control
command into which the error was introduced: $T_x$, $T_y$, or $T_z$. Corrupting the X, Y, or Z
element of the control command generated highly correlated effects in the response of the
simulated satellite through roll, pitch and yaw angles, respectively.

Rather than present an exhaustive collection of data from the thirty scenarios 
tested, the most significant results are included here. The first notable outcome of the
tests is the conclusion that at the low operating power level assumed for this ADCS (25
mW), a transient data fault can generate an error only great enough to change the
magnitude of the system transient response - it cannot change the steady-state properties
of the system. For example, in Figure 36 an instantaneous unbounded error has been
injected into the X element of the commanded control vector at 5 seconds (shown by the
sharp spike in $T_x$ control at t = 5).

Figure 36. Unbounded Transient Error Effect on BRMSS Reference Maneuver.

The effect of this spike is a steep rise and increased overshoot in the roll of the
spacecraft: the peak roll angle is 1.4° instead of the nominal 1.1°, which is a 300%
increase in percent overshoot and a 27% increase in maximum attitude change of the
spacecraft. However, the position settles to its steady state at (1°, 1°, 1°) in the same
amount of time that it took to settle when there was no error to counteract. This result
was the same in all tests run with transient errors. The worst-case transient error causes
the spacecraft to experience greater torques, faster motion and more total rotation than in
the maneuver with no error. Raising the simulated input signal power above the nominal
25 mW increases the overshoot of the system still further, but does not change the timing
of the maneuver. Since the error is transient, the feedback control system corrects the
error in the next cycle of the feedback loop. There is no lasting effect, but the spacecraft
parts and assembly must be rated to withstand effects due to an envelope of higher
torques that may occur when the ADCS corrects for transient errors in its data path.
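The two percentages quoted for the transient case follow from the peak angles and the 1° target. The short calculation below is a worked check; the function and variable names are hypothetical.

```python
def percent_overshoot(peak, target):
    """Overshoot past the target, as a percentage of the target."""
    return 100.0 * (peak - target) / target

nominal = percent_overshoot(1.1, 1.0)   # about 10 percent with no error
faulted = percent_overshoot(1.4, 1.0)   # about 40 percent with the transient

# Relative growth in percent overshoot: (40 - 10) / 10
overshoot_increase = 100.0 * (faulted - nominal) / nominal
# Relative growth in peak attitude excursion: (1.4 - 1.1) / 1.1
peak_increase = 100.0 * (1.4 - 1.1) / 1.1
```

The overshoot grows from about 10% to about 40% (a 300% increase), while the peak angle itself grows by about 27%.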

If components rated for high torques are not practical, using RPR is one way to 
reduce the effect of a transient error. Figure 37 shows the response of the system attitude 
and control when RPR of degree 8/52 is applied; the difference between the RPR case 
and the case with no error injected (Figure 34 and Figure 35) is almost imperceptible. 



Figure 37. Transient RPR Result Effect on BRMSS Reference Maneuver (r = 8).

When a persistent error is introduced to a control system, the effect is much more
significant than that of a transient error. Since the source of a persistent error is in the
system configuration, a persistent error corrupts data in every execution of the feedback
loop and disturbs every control command sent to the actuators. A representative example
of the effect of a persistent error in the X element of the control command is shown in
Figure 38. To check whether the system would eventually converge, the simulation time
was extended from 50 seconds to 500 seconds for some trials of the persistent error
scenario. The system did not settle: the noisy oscillation in the control and corresponding
motion in roll angle maintained their average magnitudes over time (Figure 39).


[Figure panels: Unbounded Persistent Error in $T_x$ - Angular Position; Unbounded Persistent Error in $T_x$ - Commanded Control]

Figure 38. Unbounded Persistent Error Effect on BRMSS Reference Maneuver.


When RPR is applied to the system, the magnitude of the error is dramatically 
reduced. While a configuration fault in the unprotected system causes unbounded error 
that makes the system completely unusable, a configuration fault in the system protected 
with RPR generates errors whose magnitude is strictly bounded, and even a small degree 
of RPR (r = 8) shows marked improvement in the system trajectory (Figure 40). 


[Figure panel: Unbounded Persistent Error - Angular Position (500 s)]

Figure 39. Extended Simulation of Unbounded Persistent Error.

Figure 40. Persistent RPR Result Effect on BRMSS Reference Maneuver (r = 8).

The scenario in Figure 40 was run with SNR equivalent to RPR of degree 8/52
(SNR = 27 dB) applied to the Y component of the control command. (The effect was
similar when RPR was applied to the X and Z components, so those results are not
shown.) The response for RPR 8/52 is still not satisfactory for the fine pointing
requirements of the BRMS, but could potentially provide an adequately steady state to
operate a spacecraft with less stringent attitude control mission requirements (e.g., an RF 
communications satellite). However, when the scenario was run with SNR equivalent to 
RPR of degree 16/52 (SNR = 51 dB) both control and angle trajectories were virtually 
indistinguishable from the error-free case. This is shown in Figure 41 (compare to Figure 
34 and Figure 35). 



Figure 41. Persistent RPR Result Effect on BRMSS Reference Maneuver (r = 16).

These results show that RPR has the potential to supply enough precision in a 
reduced-precision solution to allow continuous operation through both transient and 
persistent errors, even in finely-controlled satellites. 

One factor that should be addressed in the next treatment of this subject is the
bandwidth of the control system - i.e., the speed with which the system state is updated.
The scenarios in this chapter were designed to model a “worst case” scenario, so the
simulations were first run with a fixed sample time of 0.25 seconds. This translates to a
state update four times per second, which is slower than most ADCS for
three-axis-stabilized spacecraft. Therefore, several simulations were run again with a fixed sample
time of 0.025 seconds, which is more realistic (and even optimistic) for modern ADCS
designs on reconfigurable computers. The overall effect of the faster system was better
control responses, noticeable in both the transient and persistent error cases. With the
0.025-second sample time, the RPR 8/52 scenario was markedly smoother due to the higher state
update rate. However, some oscillation was still present (see Figure 42), so RPR 16/52
would still be wise for systems like the BRMS that require fine pointing accuracy or jitter
control.


[Figure panel: RPR (r = 8) Persistent Error - Angular Position]

Figure 42. Small Timestep.

C. EVALUATING PERFORMANCE IN SOFT RADIO SYSTEMS: FFT


The level of complexity chosen for evaluating system performance using RPR in 
a soft radio system was the FFT example function described in Chapter II. For numeric 
simulation, the function chosen was the magnitude-FFT, whose final result is the 
magnitude of the complex FFT output. In order to model single error injection, the input 
sources needed to be managed individually. Therefore the point sizes chosen for the FFT 
(number N of samples in a computation batch) were small: for this experiment, N = 8 and
N = 16. The model constructed for the 8-point FFT is shown in Figure 43; the script 
developed to run the model and conduct analysis on the data is included in Appendix C. 
The 16-point model (not shown) was identical in architecture, but had 16 input channels 
(including AWGN channel emulators and switches) instead of eight. 



Figure 43. FFT Error Simulation Model.

Since the best method of executing the main experiment was to test the entire FFT
as a block operation, the propagation of a single error through an FFT needed to be
confirmed before the main experiment. Error propagation through one FFT was tested 
separately using a model constructed of four levels of butterfly machines (Figure 44). 



Figure 44. FFT Constructed With Four Levels of Fixed-Point Complex Butterfly
Machines, N = 16 (Error Injected at Level 2).


Figure 44 depicts a 16-point FFT constructed of complex butterfly operations that 
follow the rules described in Chapter III. This FFT was tested using sets of 16 random 
fixed-point inputs on the interval (-1, 1) with an “error” applied to one input. The 
system was tested with the single error in different locations in each level to understand
the propagation of error through the FFT. In summary, the closer an error is injected to
the beginning of the FFT, the more output values it will affect. This is logical, since in
each level of an FFT the BFMs operate on different combinations of the inputs until each
output is a summation of all the inputs. In order to model the worst-case scenario
for system performance, the error must be introduced as early as possible in the FFT, i.e., 
at one of the original inputs. This also aligns with the model of applying RPR at the 
block operation level: if one of the inputs to this FFT comes from another operation that 
was deemed to be faulty, it will already have effective precision r and will therefore 
affect the precision of the entire FFT. The degree to which a single erroneous input 
affects the FFT output is shown by the main FFT experiment. 
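The propagation behavior summarized above can be reproduced with a small floating-point sketch. This is not the fixed-point butterfly model of Figure 44: it is a generic radix-2 decimation-in-time FFT written for illustration, and the function names, the fault size `delta` and the detection tolerance are assumptions.

```python
import cmath

def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of integer i."""
    out = 0
    for _ in range(bits):
        out = (out << 1) | (i & 1)
        i >>= 1
    return out

def fft_with_fault(x, level=None, index=0, delta=0.0):
    """Radix-2 decimation-in-time FFT that optionally corrupts one value.

    The value at position `index` of the working array is offset by `delta`
    just before butterfly level `level` (1 = first level) is computed.
    """
    n = len(x)
    levels = n.bit_length() - 1
    a = [complex(x[bit_reverse(i, levels)]) for i in range(n)]
    for s in range(1, levels + 1):
        if s == level:
            a[index] += delta
        m = 1 << s
        half = m >> 1
        for k in range(0, n, m):
            for j in range(half):
                w = cmath.exp(-2j * cmath.pi * j / m)
                t = w * a[k + j + half]
                u = a[k + j]
                a[k + j] = u + t
                a[k + j + half] = u - t
    return a

def affected_outputs(x, level, index=0, delta=0.125):
    """Count outputs that change when a fault is injected before `level`."""
    clean = fft_with_fault(x)
    faulty = fft_with_fault(x, level, index, delta)
    return sum(1 for c, f in zip(clean, faulty) if abs(c - f) > 1e-9)
```

For a 16-point transform, each level further from the input halves the number of corrupted outputs, so a fault at one of the original inputs is the worst case.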

Testing for the main experiment included repeated execution of 100,000 trials 
(sets) of 8- and 16-point FFTs with random input sets (see Figure 43). A series of trials
was run with SNR set equivalent to each of r = 0 (no fault tolerance), 8, 16, 24, and 32. 
A single error was injected by selecting the AWGN channel for one of the input values 
instead of the direct channel. In this model, one full FFT was computed at each timestep 
- this reduced the difference between the effect of a transient (data) fault and persistent 
(configuration) fault to whether the noise was injected for multiple FFT trials. In other 
words, the equivalent error in one FFT was the same whether it was a single data error 
due to a data memory fault, or one of several data errors appearing in succession as they 
would in the case of a configuration memory fault. 


The results of this experiment are presented in two ways: Table 11 shows the
relative error in a representative set of FFT output points for the full range of SNR
settings. Figure 45 and Figure 46 show this data as a series of points for each SNR
setting r whose y-axis values are the relative error in the representative output points.


| FFT size (N) | r | SNR (dB) | Err In (%) | Error Out X[0] (%) | Error Out X[1] (%) | Error Out X[2] (%) | Error Out X[3] (%) | Error Out X[4] (%) | Error Out X[5] (%) |
|---|---|---|---|---|---|---|---|---|---|
| 8 | 0 | 3.00 | 187.0% | 46.68% | 5.99% | 6.03% | 5.96% | 46.01% | N/A |
| 8 | 8 | 27.09 | 9.03% | 5.09% | 0.37% | 0.37% | 0.38% | 5.82% | N/A |
| 8 | 16 | 51.18 | 0.47% | 0.19% | 0.02% | 0.02% | 0.02% | 0.25% | N/A |
| 8 | 24 | 75.26 | 0.027% | 0.0173% | 0.0015% | 0.0015% | 0.0014% | 0.0125% | N/A |
| 8 | 32 | 99.34 | 0.002% | 0.0007% | 0.0001% | 0.0001% | 0.0001% | 0.0006% | N/A |
| 16 | 0 | 3.01 | 127.7% | 39.61% | 4.33% | 4.34% | 4.25% | 4.34% | 4.41% |
| 16 | 8 | 27.09 | 5.10% | 2.20% | 0.28% | 0.27% | 0.27% | 0.26% | 0.26% |
| 16 | 16 | 51.18 | 0.31% | 0.12% | 0.02% | 0.02% | 0.02% | 0.02% | 0.02% |
| 16 | 24 | 75.26 | 0.022% | 0.0056% | 0.0010% | 0.0010% | 0.0010% | 0.0010% | 0.0010% |
| 16 | 32 | 99.34 | 0.001% | 0.0005% | 0.0001% | 0.0001% | 0.0001% | 0.0001% | 0.0001% |

Table 11. Error in FFT with No Fault Tolerance and RPR Fault Tolerance.

Figure 45. Relative Error in FFT Representative Output for RPR (Nfft = 8).

The multiple X-axis tick labels represent the fact that the error in the FFT output
was symmetric, e.g., the error in X[3] and X[13] for the 16-point FFT had the same
magnitude.



Figure 46. Relative Error in FFT Representative Output for RPR (Nfft = 16). 

























Overall, the data gathered from the FFT experiment shows a marked increase in 
reliable precision of the output - and therefore improvement in accuracy - with higher 
degrees of RPR. Although the smallest tested degree of RPR (r = 8) still allowed an 
average of five percent relative error in the principal FFT output values, the next larger 
degree of RPR (r = 16) allowed less than one-quarter of one percent maximum error in 
any result. In every case, any degree of RPR afforded orders of magnitude less error in 
all final output values than the error in the results obtained using the unprotected systems. 

Also, the relationship between relative error and degree of RPR is comparable 
across different sizes of FFT. Before implementing a very large FFT, additional study 
should be done to confirm that the precision of the output is maintained over many (e.g., 
10) levels of butterfly operations. However, if the operations are implemented with 
guard bits as discussed in Chapter III, the precision should be preserved. 

Ultimately the trade between the space savings of small degrees of RPR and the 
more significant error allowed by small degrees of RPR is a consideration that must be 
evaluated by a system designer. 

D. GENERAL NOTES ON RPR-PROTECTED SYSTEM PERFORMANCE

From the simulations in this chapter several points can be made. The effect of a 
transient error due to a data memory fault is far less damaging than the effect of a 
persistent error due to a configuration memory fault. Even in a worst-case scenario, 16 
bits or less of precision in an RPR result for an overall system n = 52 (i.e., RPR degree 
16/52 or smaller) provides the accuracy necessary to maintain tight control in an ADCS 
or less than 0.2% error in FFT output elements. The speed with which a recursive data 
management system (like an ADCS) operates has a significant impact on the precision 
required of an acceptable RPR result. Overall, the performance of a system improves 
with larger degrees of RPR - but the fundamental benefit of RPR increases with smaller 
degrees of RPR, so any designer must evaluate carefully the trade space bounded by 
FPGA logic capacity, operation speed, and lowest tolerated precision. 
V. APPLYING RPR TO A COMPLEX SYSTEM



A. THE NPS SPACECRAFT SIMULATOR

The Bifocal Relay Mirror Spacecraft (BRMS) simulator (BRMSS) is an 
experimental test bed developed at the Naval Postgraduate School and used for ground 
testing of spacecraft adaptive control algorithms [32]. The BRMSS is based on a
structure of circular platforms supported by a spherical air bearing that allows it to rotate 
freely about three axes. The top platform of the simulator contains the spacecraft payload 
apparatus; between the middle and bottom platforms are the spacecraft computers, 
sensors, and actuators. Mounted around the edge of the simulator are three flexible 
appendages, each linked to the main body by a single torsional spring. Three Control 
Moment Gyroscope (CMG) actuators provide torque for the rotational motion of the 
spacecraft (Figure 47). 


Figure 47. NPS Bifocal Relay Mirror Spacecraft Simulator (After [32]).

The appendages are meant to simulate the motion of solar arrays or antennas,
which are often flexible components attached to the central body of a spacecraft. 

Before new algorithms are applied to the test bed hardware, they are developed in 
a software simulation that contains models of the simulator dynamics, the simulated 
kinematics of the spacecraft, the controller, the allocation of control to the actuators, and 
the commanded new attitude and rate for the simulator (Figure 48). In the following 
sections, the dynamics and control blocks of this model are examined and improved. At 
the conclusion of this chapter is a brief analysis from the perspective of implementation 
on a reprogrammable computer, and suitable locations for applying RPR are noted. 


Figure 48. BRMS Simulator System Model (Copy of Figure 32).


B. DYNAMICS OF THE BRMS SIMULATOR


1. Current Model: Rigid-Body Dynamics


Currently the BRMSS is modeled as a central, rigid body with no flexible 
appendages and body-centered, body-fixed coordinate frame OXYZ (Figure 49). There is 
non-linear coupling among the three axes of rotation of the simulator. The moment of 
inertia (MOI) matrix for the rigid-body BRMSS in its body-centric reference frame B
is represented by

\[
J_B = \begin{bmatrix}
J_{xx} & J_{xy} & J_{xz} \\
J_{yx} & J_{yy} & J_{yz} \\
J_{zx} & J_{zy} & J_{zz}
\end{bmatrix} \tag{21}
\]
with principal axis (X, Y, Z) components $J_{xx}, J_{yy}, J_{zz}$ and cross-coupling terms
$J_{xy}, J_{yx}, J_{xz}, J_{zx}, J_{yz}, J_{zy}$ in the central body reference frame. The values for each
inertia matrix element are determined experimentally in [32] to be

\[
J = \begin{bmatrix}
130.34 & 3.02 & 10.52 \\
3.01 & 174.64 & -0.40 \\
10.52 & -0.40 & 181.23
\end{bmatrix} \tag{22}
\]


Equation (22) shows that the rigid-body principal axis terms dominate the overall system 
MOI (i.e., the principal axis terms are much larger than the cross-coupling terms). 
Therefore, for the purposes of this approximate model the cross-coupling terms of the 
rigid-body MOI are neglected, leaving 



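\[
J = \begin{bmatrix}
J_{xx} & 0 & 0 \\
0 & J_{yy} & 0 \\
0 & 0 & J_{zz}
\end{bmatrix}
= \begin{bmatrix}
130.34 & 0 & 0 \\
0 & 174.64 & 0 \\
0 & 0 & 181.23
\end{bmatrix} \tag{23}
\]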
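The size of the neglected terms can be checked directly. The sketch below uses the experimentally determined values quoted above, laid out in the order they are listed in Equation (22), and compares the largest cross-coupling term with the smallest principal term. The helper names `diagonal_part` and `max_cross_coupling_ratio` are hypothetical and are not taken from the thesis software.

```python
# MOI values quoted in Equation (22); row/column layout as listed there.
J_FULL = [
    [130.34, 3.02, 10.52],
    [3.01, 174.64, -0.40],
    [10.52, -0.40, 181.23],
]


def diagonal_part(matrix):
    """Keep the principal-axis terms and zero the cross-coupling terms."""
    n = len(matrix)
    return [[matrix[i][j] if i == j else 0.0 for j in range(n)] for i in range(n)]


def max_cross_coupling_ratio(matrix):
    """Largest |off-diagonal| entry divided by the smallest |diagonal| entry."""
    n = len(matrix)
    largest_off = max(abs(matrix[i][j]) for i in range(n) for j in range(n) if i != j)
    smallest_diag = min(abs(matrix[i][i]) for i in range(n))
    return largest_off / smallest_diag


J_DIAG = diagonal_part(J_FULL)
ratio = max_cross_coupling_ratio(J_FULL)
print(J_DIAG)
print(f"worst-case cross-coupling ratio: {ratio:.4f}")
```

With these values the worst-case ratio is 10.52 / 130.34, or roughly 8 percent, which gives a sense of the approximation made in moving to the diagonal form.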



Figure 49. Rigid-body model of BRMSS, showing principal body axes.

The simplified equations of motion (EOM) of the rigid body BRMSS in matrix
form are

\[
\begin{bmatrix}
J_{xx} & 0 & 0 \\
0 & J_{yy} & 0 \\
0 & 0 & J_{zz}
\end{bmatrix}
\begin{Bmatrix} \ddot{\theta}_x \\ \ddot{\theta}_y \\ \ddot{\theta}_z \end{Bmatrix}
=
\begin{Bmatrix} T_x \\ T_y \\ T_z \end{Bmatrix} \tag{24}
\]

where $T_x, T_y, T_z$ are the components of the control torque applied by the actuators (the
CMGs) in the direction of each of the principal inertia axes of the rigid-body model.


2. Modeling the Flexible Appendages


There are three identical flexible appendages attached to the central body of the 
BRMSS. Each appendage consists of a long reinforced (rigid) bar with a mass on each 
end; the bar is connected to the main simulator body by a torsional spring at the center of 
the bar (Figure 50). The rigid bar has thickness a, length b, width c and mass M. Each
end mass has radius r and mass m. The center of each end mass is located a 
perpendicular distance R from the torsional spring. The torsional spring has spring 
constant k. 
Figure 50. BRMSS Flexible Appendage* Model (Not To Scale).
*Local coordinate system (oxyz) shown is used for appendages 1 and 2 only.

The principal moments of inertia of the appendage in its local body frame are
calculated using simple geometry and the parallel axis theorem. Using the parameters 
defined in Figure 50, the principal moments of inertia of a single appendage in its local 
frame of reference (oxyz) are 

Each torsional spring is attached to the BRMSS main structure at a point in the XY
(horizontal) plane of the central body (Figure 51). Appendages 1 and 2 are oriented such
that the axes about which the appendages rotate lie in the central body XY (horizontal)
plane, displaced from the central body X axis by the angles $\alpha$ and $\beta$, respectively.
Appendage 3 is oriented such that its local axis of rotation is parallel to the central body Z
axis (i.e., the appendage sweeps out local angle $\gamma$, shown in Figure 50, in the central body
XY plane). The vectors $\mathbf{S}_i$ from the origin of the central body O to the attachment point
of each torsional spring are determined to be

\[
\begin{aligned}
\mathbf{S}_1 &= S_1\cos\alpha\,\mathbf{X} + S_1\sin\alpha\,\mathbf{Y} + 0\,\mathbf{Z} = S_1\left(\cos\alpha\,\mathbf{X} + \sin\alpha\,\mathbf{Y}\right), & 0 > \alpha > -\tfrac{\pi}{2} \\
\mathbf{S}_2 &= S_2\cos\beta\,\mathbf{X} + S_2\sin\beta\,\mathbf{Y} + 0\,\mathbf{Z} = S_2\left(\cos\beta\,\mathbf{X} + \sin\beta\,\mathbf{Y}\right), & -\tfrac{\pi}{2} > \beta > -\pi \\
\mathbf{S}_3 &= 0\,\mathbf{X} + S_3\,\mathbf{Y} + 0\,\mathbf{Z} = S_3\,\mathbf{Y}
\end{aligned} \tag{26}
\]

Figure 51. BRMSS Appendage Attachment Points and Orientation (Top View).


In order to determine the effect of the flexible appendages on the dynamics of the 
BRMSS, the MOI of each appendage must be expressed in the central body reference 
frame. For appendages 1 and 2 there are two transformations: from the central body 
frame B to the torsional spring frame S, and from the spring frame S to the local 
appendage body frame A. The first transformation is a rotation about the central body Z 
(or $B_3$) axis by the angle $\alpha$, represented by the third Euler axis rotation matrix [33],

\[
{}^{B}C^{S} = \begin{bmatrix}
\cos\alpha & \sin\alpha & 0 \\
-\sin\alpha & \cos\alpha & 0 \\
0 & 0 & 1
\end{bmatrix} \tag{27}
\]


where $\alpha$ is defined as the constant angle between the central body X axis and the vector
drawn from the central body origin O to the spring attachment point for appendage 1 (see
Figure 51). In this scenario, the value of angle $\alpha$ is such that $0 > \alpha > -\tfrac{\pi}{2}$. The second
transformation is a rotation of the appendage about the spring axis, which is coincident
with the local x axis of the appendage. This is represented by the first Euler axis rotation
matrix [33],
\[
{}^{S}C^{A} = \begin{bmatrix}
1 & 0 & 0 \\
0 & \cos\gamma_1 & \sin\gamma_1 \\
0 & -\sin\gamma_1 & \cos\gamma_1
\end{bmatrix} \tag{28}
\]

where the angle $\gamma_1$ is swept out by appendage 1 rotating about its local x axis.

The complete transformation needed to express vector properties of appendage 1 
in the central body reference frame is the multiplication of these two rotation matrices, or 

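\[
{}^{B}C^{A_1} = {}^{B}C^{S}\,{}^{S}C^{A} = \begin{bmatrix}
\cos\alpha & \sin\alpha\cos\gamma_1 & \sin\alpha\sin\gamma_1 \\
-\sin\alpha & \cos\alpha\cos\gamma_1 & \cos\alpha\sin\gamma_1 \\
0 & -\sin\gamma_1 & \cos\gamma_1
\end{bmatrix} \tag{29}
\]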
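The product of the two rotations can be verified numerically. The sketch below multiplies the third-axis rotation by the first-axis rotation, in that order, and compares the result with the closed-form combined matrix for appendage 1. The function names and the sample angles are hypothetical values chosen for illustration, not parameters of the BRMSS.

```python
import math


def rot3(angle):
    """Third Euler axis rotation matrix (rotation about the Z axis)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def rot1(angle):
    """First Euler axis rotation matrix (rotation about the x axis)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def combined_closed_form(alpha, gamma):
    """Closed-form combined transformation for appendage 1."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [[ca, sa * cg, sa * sg], [-sa, ca * cg, ca * sg], [0.0, -sg, cg]]


alpha = math.radians(-30.0)   # hypothetical attachment angle, 0 > alpha > -pi/2
gamma_1 = math.radians(5.0)   # hypothetical appendage deflection

product = matmul(rot3(alpha), rot1(gamma_1))
closed = combined_closed_form(alpha, gamma_1)
max_diff = max(abs(product[i][j] - closed[i][j]) for i in range(3) for j in range(3))
print(f"largest element difference: {max_diff:.2e}")
```

The same check applies to appendage 2 by substituting the second attachment angle and deflection angle.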

By similarity, the transformation matrix needed to express vector properties of 
appendage 2 in the central body reference frame is 

\[
{}^{B}C^{A_2} = \begin{bmatrix}
\cos\beta & \sin\beta\cos\gamma_2 & \sin\beta\sin\gamma_2 \\
-\sin\beta & \cos\beta\cos\gamma_2 & \cos\beta\sin\gamma_2 \\
0 & -\sin\gamma_2 & \cos\gamma_2
\end{bmatrix} \tag{30}
\]

where $\beta$ is defined as the constant angle between the central body X axis and the vector
drawn from the central body origin O to the spring attachment point for appendage 2 (see
Figure 51 and Equation (26)), and $\gamma_2$ is the angle swept out by appendage 2 rotating about
its local x axis.

Appendage 3 is oriented differently: its local axis of rotation is aligned with the 
principal Z axis of the central body. Because of this alignment, it is more straightforward 
to redefine the local body axes for appendage 3 as depicted in Figure 52 (compare to 
local body axes in Figure 50 for appendages 1 and 2). Using this local reference frame,
the single transformation required to express vector properties of appendage 3 in the
central body reference frame is the third Euler rotation matrix [33],

\[
{}^{B}C^{A_3} = \begin{bmatrix}
\cos\gamma_3 & \sin\gamma_3 & 0 \\
-\sin\gamma_3 & \cos\gamma_3 & 0 \\
0 & 0 & 1
\end{bmatrix} \tag{31}
\]

where $\gamma_3$ is the angle swept out by appendage 3 rotating about its local x axis.
Figure 52. Flexible Appendage - Local Body Axes Redefined for Appendage 3 Only.


The principal moments of inertia for appendage 3 in the local appendage body 
frame, with its redefined local body axes, are re-ordered from Equation (25) to be 

JA,„=^M[a^+b^) + l[mR^) (32) 

The local MOI for each appendage can now be expressed in the central body
reference frame B via the inertia transformation found in [33],


where 





0 Ja„ 
0 0 


0 

0 


(33) 


(34) 


using the principal moments of inertia from Equation (25) or Equation (32), for
appendages 1 and 2 or appendage 3, respectively.


In addition to the central-body MOI and the representation of the MOI of each 
appendage in central-body frame, the total BRMSS MOI takes into account the effect of 
the mass of each appendage on the central body as dictated by the parallel axis theorem. 

The parallel axis theorem applied to the MOI of the BRMSS appendages is 
(35)

from [33], where

\[
[S] = \begin{bmatrix}
0 & -S_z & S_y \\
S_z & 0 & -S_x \\
-S_y & S_x & 0
\end{bmatrix} \tag{36}
\]

using the vectors $\mathbf{S}_i$ as defined in Equation (26) for the spring attachment point of each
appendage.

The complete BRMSS MOI is equal to the sum of the original rigid-body MOI in 
its central body reference frame (Equation (23)) and the parallel axis effects of the mass 
of each flexible appendage (applying Equation (35)). The complete system MOI is
represented in compact form by

[ ] = [ V, ] +1 ([ V„ ] - ]). (37) 

/=1 

In addition to the system MOI, the spring constant of the torsional spring in each 
appendage is necessary in order to find the equations of motion (EOM) for the BRMSS. 
The stiffness of the torsional spring for each flexible appendage is calculated using the
experimentally determined natural frequency of the appendage $f_n$, as in

\[
\omega_n = \sqrt{\frac{k}{J_{A,\mathrm{rot}}}} = 2\pi f_n
\quad\Rightarrow\quad
k = J_{A,\mathrm{rot}}\left(2\pi f_n\right)^2 \tag{38}
\]

where $J_{A,\mathrm{rot}}$ is the moment of inertia of the appendage about its axis of rotation (taken
from Equation (25) for appendages 1 and 2, and from Equation (32) for appendage 3). The
variable $\omega_n$ is the natural frequency of the rotating body in radians per second, $f_n$ is the
natural frequency of the rotating body in Hertz,
and the constant k is the calculated stiffness of the torsional spring.
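As a worked illustration of this relationship, the sketch below computes a torsional stiffness from a moment of inertia and a measured natural frequency, then recovers the frequency as a consistency check. The numerical values are hypothetical placeholders, since the measured appendage properties are not listed in this section, and the function names are introduced only for this example.

```python
import math


def torsional_stiffness(inertia, natural_freq_hz):
    """Stiffness k = J * (2*pi*f_n)^2 for a single-degree-of-freedom oscillator."""
    omega_n = 2.0 * math.pi * natural_freq_hz  # rad/s
    return inertia * omega_n ** 2


def natural_frequency_hz(inertia, stiffness):
    """Inverse relation: f_n = sqrt(k / J) / (2*pi)."""
    return math.sqrt(stiffness / inertia) / (2.0 * math.pi)


# Hypothetical appendage values (not from the thesis).
J_ROT = 2.0   # kg*m^2, moment of inertia about the axis of rotation
F_N = 0.5     # Hz, measured natural frequency

k = torsional_stiffness(J_ROT, F_N)
print(f"k = {k:.3f} N*m/rad")
print(f"recovered f_n = {natural_frequency_hz(J_ROT, k):.3f} Hz")
```

Because the stiffness scales with the square of the natural frequency, a small error in the measured frequency roughly doubles as a relative error in k.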


The EOM that completely describe the BRMSS depend on a set of six state
variables: $\{\theta_x, \theta_y, \theta_z, \gamma_1, \gamma_2, \gamma_3\}$. In compact form, these equations are

[ ]{4)+E *,({«.)-[ "c"' ]{ 2, }) = { ) 

i=i (39) 
When neglecting cross-coupling as in Equation (23), the inertia matrices become
diagonal. After applying the transformations of Equations (29) through (31), the
decoupled EOM can be separated and treated as three independent control problems in the
three principal central-body axes. This process is shown in Equations (40) through (46).
Equations (40) through (43) are the four EOM (the central body and each of the three
appendages), in matrix form (three dimensions) with no cross-coupling effects among the
dimensions.

(43)


Equation sets (44) through (46) show the EOM rewritten as uncoupled equations 
from Equations (40) through (43), regrouped to address motion in the X, Y, and Z 
principal axes of the central body. 
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JbJb^ + ^ \[^B^ -cose ■y,) + [es -COSp-y^) + - o)] = 

JAX /i+k yi-^cosa -Oj^ -sin«-^^ j =0 
^Aiji+k y2-(cosp-ej, -smp-ej,^ =0 

‘^^3„(0) + ^ (0)-(cos/3-^^ -sin/3-^^ ) =(sin;-3-^^ -cos/3-^3j ) = 0 


'^bJb,^ +k [eg +sma-yx) + (eg +smp-y2Y -o) = 

JAX (0) + A: 0-^sin«-cosfj •6’^^+cos«-cos/j •6’g -smy^^-O^ =0 
Ja2 (0) + A: 0-^0.P■cosy2-OjA^+cosP■cosy2-0,2 =0 

^^^3^ (0) + ^ [o ■ (sin /3 • ^B. + cos /3 • e^y )] = 0 

“ 0 ) + (^B, “ 0 ) + (^S, “/b) -f3) = ^CMG 

J AX ( 0 ) + A: 0 -^sin«-sin^j • 6 ’^^+ cos«-sin^j • 6 * 3 ; +cosfj -^5 j ... 

= sin «• sin f j • Oj^x + cos a ■ sin f j • 6 *^ + cos ^ 1 - 6 * 3 ^ =0 

J a2 (0) + A: O-^siny0-sin/2+cosy0-sin/2 •6*3^ +cosf2-^i}) ••• 

= sin y 0 • sin ^2 • dgx + cos P ■ sin ^2 ’ + cos ^ 2 ' =0 

/^3 (/3) + A: /b ) “ *^.43^^73 ^(i^3 ) “ ii 


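Equations (44) through (46) simplify to the following set of six EOM:

\[
\begin{aligned}
J_{xx}\ddot{\theta}_x + k\,(3\theta_x - \cos\alpha\,\gamma_1 - \cos\beta\,\gamma_2) &= T_x \\
J_{yy}\ddot{\theta}_y + k\,(3\theta_y + \sin\alpha\,\gamma_1 + \sin\beta\,\gamma_2) &= T_y \\
J_{zz}\ddot{\theta}_z + k\,(3\theta_z - \gamma_3) &= T_z \\
J_{A1}\ddot{\gamma}_1 + k\,(-\cos\alpha\,\theta_x + \sin\alpha\,\theta_y + \gamma_1) &= 0 \\
J_{A2}\ddot{\gamma}_2 + k\,(-\cos\beta\,\theta_x + \sin\beta\,\theta_y + \gamma_2) &= 0 \\
J_{A3}\ddot{\gamma}_3 + k\,(-\theta_z + \gamma_3) &= 0
\end{aligned} \tag{47}
\]

where $J_{xx}, J_{yy}, J_{zz}$ are the principal moments of inertia of the combined BRMSS from
Equation (37). This representation neglects cross-coupling terms in order to allow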
independent control of the BRMSS in each principal axis. These EOM describe the 
BRMSS system and may be used to design a controller that takes into account both the 
rigid-body dynamics and the flexible appendage motion. 
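The six simplified EOM can be evaluated directly as a function that returns the angular accelerations for a given state and control torque. The sketch below is one such transcription. The function name, argument layout, and all numerical values are assumptions made for illustration and do not come from the BRMSS simulation code.

```python
import math


def accelerations(theta, gamma, torque, J, J_app, k, alpha, beta):
    """Angular accelerations from the six simplified equations of motion.

    theta  = (theta_x, theta_y, theta_z)   central-body angles [rad]
    gamma  = (gamma_1, gamma_2, gamma_3)   appendage angles [rad]
    torque = (T_x, T_y, T_z)               control torque [N*m]
    J      = (J_xx, J_yy, J_zz)            combined principal MOI
    J_app  = (J_A1, J_A2, J_A3)            appendage MOI about each spring axis
    """
    tx, ty, tz = theta
    g1, g2, g3 = gamma
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    theta_dd = (
        (torque[0] - k * (3 * tx - ca * g1 - cb * g2)) / J[0],
        (torque[1] - k * (3 * ty + sa * g1 + sb * g2)) / J[1],
        (torque[2] - k * (3 * tz - g3)) / J[2],
    )
    gamma_dd = (
        -k * (-ca * tx + sa * ty + g1) / J_app[0],
        -k * (-cb * tx + sb * ty + g2) / J_app[1],
        -k * (-tz + g3) / J_app[2],
    )
    return theta_dd, gamma_dd


# Hypothetical parameters for illustration only.
J = (200.0, 200.0, 300.0)
J_APP = (2.0, 2.0, 2.0)
K_SPRING = 20.0
ALPHA, BETA = math.radians(-30.0), math.radians(-150.0)

theta_dd, gamma_dd = accelerations(
    (0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), J, J_APP, K_SPRING, ALPHA, BETA
)
print(theta_dd, gamma_dd)
```

Starting from rest with a unit torque about X, only the X-axis acceleration is nonzero, which reflects the decoupling obtained by neglecting the cross-coupling terms.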

C. CONTROL OF THE BRMS SIMULATOR 


1. State-Space System Representation 


The BRMSS dynamics and control can be represented as a linear time-invariant 
(LTI) system with noise, 


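\[
\begin{aligned}
\dot{\mathbf{x}}(t) &= A\,\mathbf{x}(t) + B\,\mathbf{u}(t) + \mathbf{v}(t) \\
\mathbf{y}(t) &= C\,\mathbf{x}(t) + D\,\mathbf{u}(t) + \mathbf{w}(t),
\end{aligned} \tag{48}
\]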


where x is the system state vector, y is the vector of observable states (a subset of x),
and u is the vector of controls. The matrix A is the system (or plant) matrix, B is the
input matrix, C is the output matrix (which selects the states observable as output), and D
is the feed-forward matrix (nominally set to [0]). The vectors v and w represent system
noise: v is the noise due to model uncertainty, and w is the noise due to measurement
uncertainty or sensor error. Although the true BRMSS system is time-varying (i.e.,
A = A(t), B = B(t), C = C(t), D = D(t)), it may be represented as an LTI system by
making two simplifications: neglecting the MOI cross-coupling of the BRMSS system
and appendages, and eliminating the time-dependence of the system MOI ($\gamma_i = 0$ for $J_{xx}$,
$J_{yy}$, $J_{zz}$ calculations only). The angles of the flexible appendages ($\gamma_i$ in Equation (47)) are
still allowed to vary.


Converting the EOM of Equation (47) into state-space form gives the first system 
equation 
\[
\dot{\mathbf{x}}(t) = A\,\mathbf{x}(t) + B\,\mathbf{u}(t) + \mathbf{v}(t) \tag{49}
\]


In the BRMSS, the only directly observable states are the angular rates of the
complete system about each principal axis $(\dot{\theta}_x, \dot{\theta}_y, \dot{\theta}_z)$. The rates can be integrated to
obtain the BRMSS system angles $(\theta_x, \theta_y, \theta_z)$, but the system angles are not directly
observed. This means that the vector y of observable states in the BRMSS system is
described as

\[
\mathbf{y}(t) = C\,\mathbf{x}(t) + D\,\mathbf{u}(t) + \mathbf{w}(t)
= \begin{bmatrix} I_{3\times 3} & 0_{3\times 9} \end{bmatrix}\mathbf{x}(t)
+ \begin{bmatrix} 0_{3\times 3} \end{bmatrix}\mathbf{u}(t)
+ \begin{Bmatrix} w_x \\ w_y \\ w_z \end{Bmatrix} \tag{50}
\]

The noise values $v_1 \ldots v_{12}$ in Equation (49) and $w_x, w_y, w_z$ in Equation (50) are
random variables with zero mean and constant covariance. The level of measurement
noise w(t) is determined in part using precision characteristics published by the sensor
manufacturers. The level of model uncertainty v(t) is based on the inaccuracies of the
model, e.g., assuming linearity or time-invariance.

Representing the BRMSS in state-space format facilitates the use of a more 
sophisticated controller for the system. A linear quadratic Gaussian controller enables 
both estimation of the unobserved states and compensation for the model uncertainty and 
measurement noise. 

2. Linear-Quadratic-Gaussian Controller

The BRMSS system is currently operated using a PD controller that multiplies the
measured error in central body rates $(\dot{\theta}_x, \dot{\theta}_y, \dot{\theta}_z)$ and angles $(\theta_x, \theta_y, \theta_z)$ (determined through
integration of the rates) by constant gains to calculate the required control torque. A
more sophisticated controller is the linear quadratic regulator (LQR) controller, which
minimizes a quadratic performance index, or cost, subject to constraints imposed by the
linear system representation of the BRMSS EOM. The LQR controller is a fundamental
example of the application of Pontryagin’s Minimum Principle to linear dynamical
systems; for a thorough derivation see [34]. Given the LTI system described in Equation
(48), the goal is to generate an optimal feedback control gain F(t) that satisfies the linear
relationship

\[
\mathbf{u}(t) = -F(t)\,\mathbf{x}(t) \tag{51}
\]

and minimizes the quadratic cost function

\[
J = \mathbf{x}^T(T)\,F(T)\,\mathbf{x}(T) + \int_0^T \left(\mathbf{x}^T Q\,\mathbf{x} + \mathbf{u}^T R\,\mathbf{u}\right) dt \tag{52}
\]

where Q and R are weighting matrices for the state and control vectors. The control gain
matrix F is given by

\[
F = R^{-1} B^T P \tag{53}
\]

where P is the solution to the continuous time algebraic Riccati equation (CARE)

\[
A^T P + P A - P B R^{-1} B^T P + Q = 0. \tag{54}
\]

Detailed derivations and discussion of numerical methods for solving these equations 
may be found in [18], [34] and [35]. 
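For a single decoupled axis of the rigid-body model, the CARE can be solved in closed form, which makes a convenient check on any numerical solver. The sketch below assumes the state ordering x = [angle, rate], a diagonal Q, and a scalar R. These choices, the weights, and the function names are illustrative assumptions and are not the weighting matrices used in the thesis simulations.

```python
import math


def lqr_single_axis(J, q_angle, q_rate, r):
    """Closed-form LQR for J * theta_dd = T with x = [theta, theta_dot].

    A = [[0, 1], [0, 0]], B = [[0], [1/J]], Q = diag(q_angle, q_rate), R = r.
    Returns the Riccati solution P and the gain row F = R^-1 B^T P.
    """
    b = 1.0 / J
    p12 = math.sqrt(q_angle * r) / b
    p22 = math.sqrt(r * (q_rate + 2.0 * p12)) / b
    p11 = (b * b / r) * p12 * p22
    P = [[p11, p12], [p12, p22]]
    F = [b * p12 / r, b * p22 / r]
    return P, F


def care_residual(P, J, q_angle, q_rate, r):
    """Entries of A^T P + P A - P B R^-1 B^T P + Q for the single-axis model."""
    b = 1.0 / J
    p11, p12, p22 = P[0][0], P[0][1], P[1][1]
    g = b * b / r
    off_diag = p11 - g * p12 * p22
    return [
        [q_angle - g * p12 * p12, off_diag],
        [off_diag, 2.0 * p12 + q_rate - g * p22 * p22],
    ]


# X-axis inertia from Equation (23); weights are hypothetical.
P, F = lqr_single_axis(J=130.34, q_angle=10.0, q_rate=10.0, r=1.0)
residual = care_residual(P, 130.34, 10.0, 10.0, 1.0)
print("gain F =", F)
print("CARE residual =", residual)
```

The angle gain reduces to sqrt(q_angle / r), independent of the inertia, while the rate gain grows with the inertia. The residual should vanish to within rounding error.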

An important requirement for using a linear quadratic regulator is that the system 
in question must have the full state available, i.e., all elements of the state vector must be 
observable. In the rigid-body BRMSS model, only three states are observable: the rates
about the central body axes $(\dot{\theta}_x, \dot{\theta}_y, \dot{\theta}_z)$. This generates significant uncertainty in the
feedback system. In cases like the BRMSS where the system is not completely
observable, the control design must include an estimator in addition to the LQR. The
LQR and the estimator together are considered to be Linear-quadratic-Gaussian (LQG)
control. An LQG controller combines an LQR with a Kalman filter to estimate the
unobservable states. It also accounts for the effects of Gaussian noise added to the
system due to model uncertainty and measurement imprecision (v and w). The LQG
controller form for the system in Equation (48) that minimizes the cost function in
Equation (52) is
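\[
\begin{aligned}
\dot{\hat{\mathbf{x}}}(t) &= A(t)\,\hat{\mathbf{x}}(t) + B(t)\,\mathbf{u}(t) + K(t)\left(\mathbf{y}(t) - C(t)\,\hat{\mathbf{x}}(t)\right), \quad \hat{\mathbf{x}}(0) = E\left(\mathbf{x}(0)\right) \\
\mathbf{u}(t) &= -L(t)\,\hat{\mathbf{x}}(t)
\end{aligned} \tag{55}
\]

where K is the Kalman gain associated with the Kalman filter used to estimate the
unobservable states, and L is the feedback gain matrix. The function E represents
expectation (in this case, of the state x at initial time t = 0). The Kalman gain K is equal
to

\[
K(t) = P(t)\,C^T(t)\,W^{-1}(t) \tag{56}
\]

where P(t) is the solution of the estimator problem posed by the matrix Riccati differential
equation

\[
\begin{aligned}
\dot{P}(t) &= A(t)P(t) + P(t)A^T(t) - P(t)C^T(t)W^{-1}(t)C(t)P(t) + V(t), \\
P(0) &= E\left(\mathbf{x}(0)\,\mathbf{x}^T(0)\right),
\end{aligned} \tag{57}
\]

in which V(t) and W(t) are the covariance matrices of the noise vectors v and w.
Similarly, the feedback gain matrix L is equal to

\[
L(t) = R^{-1}(t)\,B^T(t)\,S(t) \tag{58}
\]

where S(t) is the solution of the LQR problem expressed by the matrix Riccati
differential equation

\[
\begin{aligned}
-\dot{S}(t) &= A^T(t)S(t) + S(t)A(t) - S(t)B(t)R^{-1}(t)B^T(t)S(t) + Q(t), \\
S(T) &= F.
\end{aligned} \tag{59}
\]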

The LQG controller expressed in Equation (55) is inserted into the forward path
of a feedback loop with the plant described by Equations (49) and (50) to generate the
controlled system depicted in Figure 53. Although both the BRMSS model and the LQG
controller are expressed as state-space systems, the generic input u to the controller is in
fact the computed observable state error, and the generic output y of the controller
is the calculated system control torque u(t) that is also the input to the BRMSS dynamics.

Figure 53. Block diagram of state-space BRMSS system with LQG controller.


One consequence of using an LQG controller is the presence of nonzero steady- 
state error in the solution. This error is eliminated by including a feedforward gain R 
in the path of the reference state that scales the reference command before it is used to 
compute the system error. The feedforward gain is included in Figure 53.

As implemented for the BRMSS in this research, the LQG controller requires six
observable states as input: central-body rates $(\dot{\theta}_x, \dot{\theta}_y, \dot{\theta}_z)$ and angles $(\theta_x, \theta_y, \theta_z)$. This enables
the system to respond to commanded angles as well as rates. In the previous PD
controller model, the central body angles were obtained by integrating the directly
observable rates. The true BRMSS uses sun sensors (with a simulated sun) as well as
rate gyros to determine attitude rates and pointing angles. Therefore, the assumption for
the flexible-structure LQG controller model that the central-body position and rates are
observable is valid. The error accumulated in the real system due to any dependence of
the angles on the rate measurements is accounted for in the covariance of the noise values
v and w. The six remaining states (flexible appendage oscillation rates $\dot{\gamma}_1, \dot{\gamma}_2, \dot{\gamma}_3$ and
angles $\gamma_1, \gamma_2, \gamma_3$) are truly unobservable and are therefore estimated using the Kalman
filter in the LQG calculations.

3. Demonstrating LQG Control of the Flexible System


The enhancement of the BRMSS simulation was completed in phases. First the 
flexible structure model was developed to replace the rigid-body model and was tested 
with the classical (PD) controller; second an LQR controller was developed for the rigid- 
body model and tested in that environment. Then an LQR controller was generated for 
the flexible structure model, as if it were fully observable, to understand the nature of the 
optimized control for that system. Finally the LQG controller was developed for the 
flexible structure, and testing was completed on the fully updated simulation. 

In each scenario, the independent variables were the gains (in PD control) or the
weighting matrices (in the LQ control). There were four metrics observed that
determined “good enough” control: steady-state error within one percent, settling time
less than twenty seconds, overshoot less than two percent, and minimal (if any) amplitude
of oscillation in steady-state solution. The cost in each case was the control effort u(t),
measured both by maximum value and by total effort about all axes, $\int |\mathbf{u}(t)|\,dt$.
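These metrics can be extracted from a sampled attitude and control history. The sketch below shows one way to do so, assuming a positive step command, a 1 percent settling band, and trapezoidal integration of the control magnitude. The function names, the band definition, and the synthetic second-order test response are assumptions made for illustration and are not the post-processing used to produce the results in this chapter.

```python
import math


def step_metrics(times, response, target, band=0.01):
    """Return (steady_state_error, percent_overshoot, settling_time).

    Assumes a positive step target. settling_time is the first sample time
    after which the response stays inside the band, or None if the final
    sample is still outside it.
    """
    steady_state_error = abs(response[-1] - target)
    percent_overshoot = max(0.0, (max(response) - target) / target * 100.0)
    tolerance = band * abs(target)
    last_outside = -1
    for i, y in enumerate(response):
        if abs(y - target) > tolerance:
            last_outside = i
    if last_outside + 1 >= len(times):
        settling_time = None
    else:
        settling_time = times[last_outside + 1]
    return steady_state_error, percent_overshoot, settling_time


def total_effort(times, control):
    """Trapezoidal approximation of the integral of |u(t)| dt."""
    total = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        total += 0.5 * (abs(control[i]) + abs(control[i - 1])) * dt
    return total


# Synthetic underdamped second-order step response (hypothetical test data).
ZETA, OMEGA = 0.5, 1.0
OMEGA_D = OMEGA * math.sqrt(1.0 - ZETA ** 2)
times = [0.01 * i for i in range(3001)]
response = [
    1.0 - math.exp(-ZETA * OMEGA * t)
    * (math.cos(OMEGA_D * t) + ZETA / math.sqrt(1.0 - ZETA ** 2) * math.sin(OMEGA_D * t))
    for t in times
]

sse, overshoot, settle = step_metrics(times, response, target=1.0)
print(sse, overshoot, settle)
```

For a multi-axis maneuver, the same functions would be applied per axis and the control efforts summed.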

a. Rigid-body model with classical control (original system) 

The original BRMSS system simulation used a rigid-body model with a
classical PD controller operating on attitude quaternions (transformed from Euler
angles) and rates (reference Figure 48). The arbitrary MOI for the rigid body was
$(J_{xx}, J_{yy}, J_{zz}) = (15, 15, 25)$, and the (rate, angle) gains were (10, 10). Figure 54 depicts the
attitude history of the system through a reference maneuver from (0° 0° 0°) to (1° 1° 1°).
It shows that the settling time of the system is between 20 and 30 seconds, the overshoot 
is less than 20%, and there is no error (bias) or oscillation in the steady-state solution. 
Figure 55 shows the control history of the system through the same reference maneuver; 
it shows that the maximum control effort expended is approximately 0.09 N-m. The total 
control torque applied for the reference maneuver about all three axes is 25.3 N-m. 
Figure 54. Attitude of Main Body (Rigid-Body Model with PD Control).



Figure 55. Control history (Rigid-Body Model with PD Control).

After determining the response of the original system to the reference
maneuver, the first modification task was to replace the rigid-body BRMSS model with a
new flexible structure model.

b. Flexible structure model with classical control 

For the first system modification, the Euler dynamics model was replaced
with the state-space model derived during this research for the flexible system. The
flexible system has twelve total states with three observable states $(\dot{\theta}_x, \dot{\theta}_y, \dot{\theta}_z)$ that are
integrated to obtain the attitude angles $(\theta_x, \theta_y, \theta_z)$. Figure 56 and Figure 57 depict the
attitude angles and control history, respectively, for this scenario. Figure 56
demonstrates that the settling time, overshoot and steady-state accuracy are comparable
to those of the rigid-body model. Figure 57 shows that the control required for this model
is considerably higher than for the rigid-body model, but this is reasonable since the MOI
for the flexible-structure model is on the order of $(J_{xx}, J_{yy}, J_{zz}) = (200, 200, 300)$, which
is an order of magnitude more than the rigid-body model. The total control torque
required to execute the reference maneuver for the flexible structure using the PD
controller is 377.2 N-m.



Figure 56. Attitude Angles (Flexible Structure with PD Control).

Figure 57. Control History (Flexible Structure with PD Control).


The next step in the modification process was to generate a new LQR 
controller for the rigid-body system using optimization methods and test its operation. 

c. Rigid-body model with LQR control 

For this system, the rigid-body EOM were converted into state-space form 
to enable the creation of an LQR controller based on the state matrix A, control matrix B,
and weighting matrices Q and R. Figure 58 shows that the LQR controller generates a 
smoother response than the PD controller for the rigid-body model; the system also 
settles faster with the LQR controller. Figure 59 shows that the maximum control 
expended with the LQR controller is less than for the PD controller (less than 0.6 N-m vs. 
0.1 N-m). The total control expended by the LQR controller to execute the reference 
maneuver with the rigid-body model is only 3.6 N-m - an improvement by a factor of six 
over the PD controller effort. 


























Figure 58. Attitude Angles (Rigid-Body Model with LQR Control).



Figure 59. Control History (Rigid-Body Model with LQR Control).

After demonstrating the concept of the LQ controller improving the rigid-
body system response at lower cost, the next task was to generate an LQR controller for
the flexible structure model.

d. Flexible structure with LQR control 

Instead of proceeding directly to the LQG controller, which combines two 
new functions (LQR for the flexible structure and the Kalman filter to estimate unknown 
states), the next system tested was a simulation of the flexible structure in full-state form 
- i.e., with all states directly observable. This eliminated the need for an estimator and 
enabled the design of an LQR controller for the flexible system. 

Figure 60 shows the attitude angles through the reference maneuver for 
the LQR-controlled flexible structure. Although the LQR controller worked very
quickly, it was necessary to increase the weighting on the main body angles $(\theta_x, \theta_y, \theta_z)$
by a factor of six in order to reduce the amplitude of a 0.01 Hz oscillation present in the
system steady state. A byproduct of this was the very fast settling time of the system 
(under 10 seconds). Figure 61 shows the control history for this system - and indicates 
that the maximum torque about any axis was around 10 N-m. The total control for the 
maneuver was 315.4 N-m - less than for the classical controller, but still large because of 
the high gains required to dampen the steady-state oscillation. 
Figure 60. Attitude Angles (Flexible Structure with LQR Control).



Figure 61. Control History (Flexible Structure with LQR Control).

e. Flexible Structure with LQG control

The final task in construction of the sophisticated model for the BRMSS
was the generation of the LQG controller for the flexible-structure model. This controller
required moderately high weighting factors on the angles $(\theta_x, \theta_y, \theta_z)$, similar to the
requirement for the LQR controller, but the magnitude of the weighting factors did not
need to be as high to achieve the same results. Figure 62 shows that the flexible-structure 
system response using the LQG controller reaches steady-state in under 10 seconds with 
less than 20% overshoot. It also has minimal steady-state error, with oscillations in the 
steady state reduced to negligible levels. Figure 63 depicts the control history for the 
LQG-controlled reference maneuver with the flexible-structure system; it shows that the 
maximum control torque in any axis is less than 6 N-m. The total control required for the 
reference maneuver using this system was 203.0 N-m. This is significantly lower cost 
than the PD or LQR-controlled systems. However, the cost is still high due to the 
presence of the steady-state 0.01 Hz oscillation in this system that must be controlled. 



Figure 62. Attitude Angles (Flexible Structure with LQG Control).

Figure 63. Control History (Flexible Structure with LQG Control).


f. Summary

A numeric summary and comparison of the results for each system is 
presented in Table 12. From this table, it is evident that in order to control the full 
flexible system at minimum cost, the linear-quadratic-Gaussian optimal control methods 
are desired. Further research in this area may focus on removing the oscillation entirely 
from the steady-state response of the flexible system. 


Case   Gain/Weight     Steady-State   θss Oscill.   Settling   Percent     Control
       (rate, angle)   Error (°)      Ampl. (°)     Time (s)   Overshoot   (Cost)

a      10, 10          0              0             30         12%         25.3
b      100, 100        0.01           0.012         30         20%         377.2
c      10, 10          0              0             30         3%          3.6
d      400, 600        5e-4           0.006         15         11%         315.4
e      100, 2e+6       1e-4           0.002         10         <5%         203.0


Table 12. BRMSS Control System Simulation Results

When used in a spacecraft for attitude control, any of these systems would
be implemented in either the main flight computer or an ADCS co-processor. The ADCS 
co-processor in particular is an ideal candidate for using reprogrammable hardware. If 
some of the actuator or sensor hardware were to become inoperable or unusable, the 
ADCS co-processor could be reprogrammed with a new algorithm that optimizes use of 
the sensor data or control torque that is still available. If the requirements for the ADCS 
were to change, e.g., the operators were to attempt a more aggressive slewing maneuver 
that required more expensive control than that of the original system design, the ADCS 
could be similarly reprogrammed with an algorithm optimized to a new cost metric 
developed for the more aggressive mission. 

If the ADCS for a spacecraft is implemented in an FPGA, it will
experience some level of SEU activity on orbit. Since spacecraft control involves
multi-part systems (measurement, control, allocation) that require substantial memory in
on-board processors, it is important to minimize the space and power required by the ADCS
on its co-processor FPGA. In order to use the RPR techniques developed in this research 
with an ADCS like the BRMSS, it is necessary to associate the processes in the BRMSS 
control system with those previously identified operations that are most suitable for 
applying RPR. 

D. APPLYING RPR TO THE BRMSS CONTROL SYSTEM 

To apply RPR fault tolerance methods to the BRMSS control system, it is 
necessary to identify the fundamental operations in the control system processes. A 
simplified block diagram of the updated BRMSS control system (flexible structure model 
and LQG controller) is depicted in Figure 64. In order from left to right in Figure 64, the 
following nine processes occur, with associated computer operations: 

1. Receive commanded state (six-element vector) - input/memory access 

2. Convert degrees to radians - multiplication by a constant 

3. Modify command with feed-forward gain - multiplication by a constant 

4. Obtain state error (current/actual - commanded) - addition/subtraction 

5. Generate control commands based on current state error - matrix 
multiplication, multiplication by a constant, addition/subtraction 

6. Apply control to system - output to actuators (allocator problem) 

7. Update system states - input from sensors (measurement problem) 

8. Convert radians to degrees - multiplication by a constant 

9. Report/store current state - output/memory access 



Figure 64. Updated BRMSS control system. 
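
The arithmetic steps in the list above (processes 2 through 5, and 8) can be sketched in a few lines of MATLAB. This is an illustration only: the gain matrices `Kff` and `K`, the sample state values, and the sign convention of the control law are placeholders chosen for the example, not the BRMSS values.

```matlab
% Illustrative sketch of processes 2-5 and 8 (all numeric values are hypothetical)
d2r = pi/180;                          % step 2 constant: degrees to radians
r2d = 180/pi;                          % step 8 constant: radians to degrees

cmd_deg = [10; 0; -5; 0; 0; 0];        % step 1: commanded state, six elements
x       = [0.10; 0; -0.05; 0; 0; 0];   % current state (rad and rad/s), example only

Kff = eye(6);                          % hypothetical feed-forward gain
K   = [2*eye(3) 5*eye(3)];             % hypothetical 3x6 feedback gain

cmd_rad = d2r * cmd_deg;               % step 2: multiplication by a constant
cmd_ff  = Kff * cmd_rad;               % step 3: multiplication by a (matrix) constant
err     = x - cmd_ff;                  % step 4: subtraction (actual - commanded)
u       = -K * err;                    % step 5: matrix multiplication
x_deg   = r2d * x;                     % step 8: multiplication by a constant
```

Each assignment maps to one of the operation types named in the list, which is what makes the loop a candidate for operation-level RPR.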


The most suitable operations for RPR are the arithmetic operations - namely 
addition/subtraction, multiplication, and combinations of these as they are found in 
matrix operations. The multiplications involving constants may be implemented such 
that the reduced-precision operations select only a subset of the MSBs of the constants as 
they are stored in memory. 
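
A minimal sketch of selecting only the leading bits of a stored constant follows. It assumes an unsigned fixed-point constant with `n` fractional bits and a nonnegative operand; the word lengths, the operand `x`, and the variable names are hypothetical and are not taken from the VHDL designs.

```matlab
% Sketch: keep only the r most significant fractional bits of a stored constant.
% Assumes an unsigned fixed-point format with n fractional bits (hypothetical).
n = 32;  r = 8;                        % full and reduced word lengths (bits)
c = pi/180;                            % degrees-to-radians constant

c_full = floor(c * 2^n) / 2^n;         % constant as stored with n fractional bits
c_lo   = floor(c_full * 2^r) / 2^r;    % truncate to r MSBs: lower bound
c_hi   = c_lo + 2^(-r);                % next r-bit value up: upper bound

x    = 37.5;                           % nonnegative operand (degrees)
y    = c_full * x;                     % full-precision product
y_lo = c_lo * x;                       % reduced-precision lower bound
y_hi = c_hi * x;                       % reduced-precision upper bound
```

For a negative operand the roles of `c_lo` and `c_hi` swap, which is the sign dependence tested in Appendix B.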

A designer of the RPR version of this ADCS would need to conduct additional 
performance trade studies before beginning implementation. One such trade is to 
compare speed and memory for storing constants. For example, the value for “180/π” 
(radians-to-degrees) may be stored once with full precision, coded to guard against errors. 
It may also be stored in multiple locations - with full and reduced precision - to allow 
simultaneous access by all copies of a multiplication circuit implemented using RPR. 

Trades between speed and memory space are only one category of complications 
that still remain for implementing RPR in a complicated system. Another residual issue 
is how often to correct configuration errors that have been detected; in a real-time system 
the accuracy of the configuration is highly important, but correction of the system 
configuration (particularly when the errors are not causing significant degradation in 
control performance) cannot interfere with regular operation. This and other questions 
are enumerated as part of the conclusion of this work. 
VI. CONCLUSIONS AND RECOMMENDATIONS 


A. SUMMARY 

The harsh radiation environment of space generates faults in FPGAs that affect 
both data and configuration memory. The Configurable Fault Tolerant Processor at the 
Naval Postgraduate School is a platform for testing methods of fault tolerance that guard 
against the single-event effects of radiation in FPGAs. In 2006 Snodgrass introduced a 
new method of fault tolerance, Reduced Precision Redundancy, as a power-saving 
alternative to traditional Triple Modular Redundancy. This research focused on the 
details of implementing RPR and the effect of RPR fault tolerance on the performance of 
spacecraft systems. 

Two categories of system architectures were discussed: recursive data 
management, found in feedback control systems; and flow-through data management, 
found in signal processing tools such as the fast Fourier transform. Examples of the two 
architectures were broken down into their elementary operations, and the common 
operations were chosen as the subjects of experiments in RPR implementation. The 
“degree of RPR” was defined as a measure of reduction in precision. Detailed RPR 
designs for addition/subtraction and multiplication were programmed, simulated and 
mapped to the Virtex™ XQVR600 FPGA using the Xilinx Integrated Software 
Environment. Versions of each operation were built in TMR and several degrees of RPR, 
and the FPGA resources required for each degree of RPR were compared to the resources 
used by the corresponding TMR experiments. The results obtained from the detailed 
designs were extrapolated to estimate the resources required to implement RPR division 
and the compound operations of matrix multiplication and the fast Fourier transform 
butterfly machine. 

An evaluation of RPR-protected system performance was conducted on models of 
recursive and flow-through data architecture systems using MATLAB and Simulink 
computational tools. Transient and persistent errors were modeled as delta and step 
functions of additive noise in the signal data flow, and RPR error correction was modeled 
as an increase in signal-to-noise ratio whose magnitude depended on the degree of RPR. 
The improvement in system response and reduction in output error between “no fault 
tolerance” and “RPR fault tolerance” were measured to determine the impact of RPR on 
system performance. 

One example system was also chosen to improve and assess as a complicated 
candidate system for RPR. A new dynamics model was developed for the Bifocal Relay 
Mirror Satellite Simulator testbed at NPS that described the effects of three flexible 
appendages, which expanded the model to a twelve-state system with limited observable 
output. A linear-quadratic Gaussian controller was developed for the flexible structure 
model by combining a Kalman filter state estimator with a cost-optimal linear quadratic 
regulator (LQR). The new dynamics and control system was tested using a reference 
maneuver, and its control cost was compared to the cost of using a simple PD controller. 
Finally, the enhanced BRMSS model was examined as a candidate system for RPR, and 
operations suitable for applying RPR were identified. 

B. CONCLUSIONS 

This research has shown that RPR is a viable fault tolerance approach for 
arithmetic operations. In order for RPR to be effective, the upper and lower bounds of 
the result must be generated in a certain manner depending on the operation being 
executed, paying particular attention to the signs of the operands. Also, the RPR voter 
must be constructed such that it conducts a numerical comparison of the MSBs of the 
precise result with the bound results, as opposed to the bitwise comparison used in TMR. 
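
The contrast between the two voter styles can be sketched as follows. The decision order mirrors the checks printed by the Appendix B.1 script; the example values are hypothetical, and returning the midpoint of the bounds when the precise result is rejected is an assumption made for illustration, not a rule stated in this work.

```matlab
% Sketch of a numerical RPR voter next to a bitwise TMR majority voter.
% Rows of 'cases' are [precise lower upper]; all values are hypothetical.
cases = [0.6180 0.50 0.75;     % healthy
         0.1180 0.50 0.75;     % precise result corrupted
         0.6180 0.90 0.75];    % a bound corrupted (lower > upper)
results = zeros(size(cases,1),1);
for c = 1:size(cases,1)
    precise = cases(c,1);  lower = cases(c,2);  upper = cases(c,3);
    out_of_bounds = (precise > upper) || (lower > precise);
    bounds_bad    = (lower > upper);
    if out_of_bounds && bounds_bad
        results(c) = precise;              % bound logic is faulty: use precise
    elseif out_of_bounds
        results(c) = (lower + upper)/2;    % ASSUMED fallback: midpoint of bounds
    else
        results(c) = precise;              % no error detected
    end
end

% TMR for comparison: bitwise majority of three full-precision copies
a = uint8(181);  b = uint8(181);  d = uint8(177);   % d has one flipped bit
majority = bitor(bitand(a,b), bitor(bitand(a,d), bitand(b,d)));
```

The RPR branch needs magnitude comparators, while the TMR branch needs only AND/OR gates per bit, which is one way to see why the RPR voter is the more complex circuit.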

Experimental results show that for the simplest operations, RPR is not always the 
most efficient fault tolerance approach. The increased complexity of an RPR voter over 
that of a TMR voter actually causes an addition operation protected at a high degree of 
RPR to be larger than its corresponding TMR-protected operation. For RPR to provide 
notable FPGA space savings over TMR in the simplest of operations, the degree of RPR 
must be less than 0.25. The significance of this result is that the findings by Snodgrass 
(that RPR provides a 50%-70% FPGA area and power savings) are largely 
application-dependent: if RPR is applied externally, as in the CORDIC processor Snodgrass 
developed, RPR provides a much greater area and power savings (over TMR) than if 
RPR is applied internally. Essentially, the benefit of RPR increases with the complexity 
of the operation to which it is applied. 

System performance simulations demonstrate that RPR provides very good 
recovery from errors caused by SEU in spacecraft systems. With a baseline precision n 
of 52 bits, even an approximate RPR result with only eight bits of precision drastically 
improved the transient and steady-state response of an attitude control system. Using the 
RPR result also reduced the relative error in frequency output values of an FFT to less 
than 0.2%. The performance simulations also demonstrated that the bandwidth of a 
feedback control system (dependent on processor speed and data I/O limitations) has 
significant impact on its ability to reject noise of any kind, which in turn affects the 
minimum acceptable degree of RPR for the system. This and other implementation 
considerations contribute to the design trade space of FPGA capacity and power, fault 
tolerance requirements and system performance metrics. 

The in-depth investigation of the dynamics and control of a flexible space 
structure illustrates that even a complex system can be reduced to operations suitable for 
RPR. The time-dependent inertia, equations of motion and optimized linear-quadratic 
Gaussian controller developed are all different combinations of the arithmetic operations 
discussed in this work, interspersed with memory access and non-numerical logic 
functions. If a system such as the BRMSS ADCS were implemented on an FPGA-based 
computer, it could be made fault-tolerant using RPR. 

A final word about RPR gleaned from this research is this: a combination of high- 
and low-level application appears to be the best use of RPR as a fault tolerance technique. 
Computing and propagating bounds from one low-level operation to the next allows good 
error control in a sequence of operations, which is necessary in a large system. However, 
reserving the RPR voter implementation for only a few major points within the system - 
or at its final output - keeps the overall cost (in FPGA area) of using RPR low compared 
to the cost of TMR. This combination of internal and external RPR could potentially be 
the most efficient RPR architecture for many types of processors, but additional 
experimentation is required before that can be postulated with confidence. 
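
The idea of carrying bounds through a chain of operations and voting only once can be sketched for the expression a*b + c*d. The sketch assumes all operands are nonnegative (so lower-times-lower and upper-times-upper bound each product), uses eight fractional bits for the reduced-precision copies, and injects a hypothetical error of 0.25; none of these values come from the experiments.

```matlab
% Sketch: propagate bounds through a*b + c*d and check only the final result.
% Assumes nonnegative operands; word length and values are hypothetical.
scale = 2^8;                                 % reduced precision: 8 fractional bits
lo = @(v) floor(scale*v)/scale;              % reduced-precision lower bound
hi = @(v) floor(scale*v)/scale + 1/scale;    % reduced-precision upper bound

a = 0.7071;  b = 0.3333;  c = 0.1250;  d = 0.9000;

y    = a*b + c*d;                            % full-precision chain
y_lo = lo(a)*lo(b) + lo(c)*lo(d);            % bounds carried from one operation
y_hi = hi(a)*hi(b) + hi(c)*hi(d);            % to the next, with no voter between

clean_ok = (y >= y_lo) && (y <= y_hi);       % fault-free result lies in bounds
y_faulty = y + 0.25;                         % inject an error in the precise path
detected = (y_faulty > y_hi) || (y_faulty < y_lo);   % single check at the output
```

Only one comparison stage is needed at the output, at the price of bounds that widen as they pass through each operation.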
C. RECOMMENDATIONS FOR FUTURE STUDY 

1. Investigating Internal vs. External RPR 

Previous research in the CFTP group at NPS explored both external and internal 
TMR. In external TMR, voters are placed at the output points of a processor and have no 
access to the intermediate results within the processor. Internal TMR designs use voters 
after individual operations or sub-processes within the main processor, and check 
intermediate outputs so that errors are never propagated far within the processor. 

Snodgrass’s initial implementation of RPR in a CORDIC processor applied RPR 
externally: the entire CORDIC algorithm was implemented in VHDL with full and 
reduced precision, and any voting logic operated on the output of the processors. The 
operations described in chapter III of this work are the building blocks for internal RPR - 
the benefit and computational cost of applying RPR internally must be evaluated using a 
full processor architecture that implements RPR at the single operation and/or 
sub-process level. 

2. Fault Detection and Location Methods 

The difference between data and configuration faults in FPGAs, and how to 
identify each with certainty, has been discussed at length in this work. Traditionally 
configuration faults in an FPGA are identified and corrected by “scrubbing” the FPGA at 
regular intervals and reloading the configuration if any bits are found to be incorrect. In 
order to proceed in this area of work, a dedicated quantitative study is necessary to 
analyze the benefits and drawbacks of using error monitoring (1) to distinguish errors due 
to configuration faults from errors due to data faults, (2) as a method of triggering 
configuration scrubs, and (3) to locate the source of a configuration error on an FPGA. 
These functions should be explored for their potential to increase the efficiency of an 
FPGA, whether it is protected by TMR or RPR. 
3. Standard Degrees of RPR 

In this thesis, arithmetic operations were implemented using combinations of 
n ∈ {16, 32, 64} and r ∈ {8, 16, 32}. In practice, r is not limited to powers of two or even 
to multiples of two. Further study of more degrees of RPR is needed to generate a set of 
“standards” that may be used for applications with certain requirements on precision. 
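
The combinations above can be tabulated quickly. The sketch assumes the degree of RPR is the ratio r/n of reduced to full word length (this section does not restate the definition, so treat that as an assumption) and flags which combinations fall below the 0.25 level mentioned in the conclusions.

```matlab
% Sketch: tabulate degree of RPR for the tested word lengths.
% ASSUMPTION: degree of RPR = r/n (reduced bits over full-precision bits).
n_vals = [16 32 64];                   % full-precision word lengths
r_vals = [8 16 32];                    % reduced-precision word lengths

[R, Nn] = meshgrid(r_vals, n_vals);    % rows follow n, columns follow r
degree  = R ./ Nn;
degree(R >= Nn) = NaN;                 % r must be smaller than n
below_quarter = degree < 0.25;         % NaN entries compare as false
```

With these three values of r and n, only the r = 8, n = 64 pair falls strictly below 0.25, which suggests that finer-grained choices of r would be needed to explore that region.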

4. Implementing RPR in Floating Point Representation 

All the VHDL or schematic FPGA design in this research was conducted using 
fixed-point numbers. The rules governing arithmetic and error accumulation are very 
different for IEEE (or other) floating-point representation - in fact, many “rules of 
algebra” are not even true for floating-point numbers [28]. Although an FPGA may be 
programmed using any numeric standard, interoperability with many general-purpose 
processors of today demands floating-point representation. The changes in behavior of 
RPR arithmetic operations when implemented using floating-point representation must be 
documented before attempting to build an RPR floating-point processor. 
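
Two standard IEEE double-precision examples illustrate how the usual rules of algebra can fail. They are generic floating-point facts rather than results from this research.

```matlab
% Sketch: algebraic identities that do not hold in IEEE double precision.
a = 0.1;  b = 0.2;  c = 0.3;
left  = (a + b) + c;
right = a + (b + c);
associative = (left == right);     % addition is not associative here

big = 2^53;                        % integers above 2^53 are not all representable
absorbed = (big + 1) - big;        % the added 1 is lost to rounding
```

Because rounding depends on operand magnitude and evaluation order, upper and lower bounds computed in floating point would need directed rounding or extra margin to remain valid.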

5. Performance Evaluation Using Hardware 

The simulations in chapter IV suggest that the signal power level and power limits 
on the device or sensors used in a system have a significant effect on what degree of RPR 
is required to maintain control and minimize error in a system. The ADCS and/or FFT 
experiments should be implemented using physical devices in order to confirm the 
relationships between signal power, equivalent noise power and degree of RPR. 

6. Comparing RPR to Other Fault-Tolerance Methods 

A concept introduced but not investigated in this research was that of the benefits 
and drawbacks of using error correction codes vs. redundancy for preserving data 
integrity during storage and transmission. Two approaches to this trade merit further 
study: using some form of RPR as protection for data storage or FPGA configuration, and 
comparing RPR to coding checks such as residue arithmetic for smaller common 
operations. The complexity of the RPR voter is such that it can be very powerful for 
checking complicated processes, but its benefit must be examined very carefully when 
working with small circuits. A detailed study is warranted on the viability of using RPR 
over a simpler check like residue addition for simple operations. 
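
For reference, a residue check on addition can be sketched in a few lines. The modulus 3 and the operand values are chosen for illustration only; this work does not specify a particular residue code.

```matlab
% Sketch: mod-3 residue check on an integer addition (illustrative values).
m = 3;                                         % check modulus (assumed)
a = 181;  b = 77;
s = a + b;                                     % main adder result

expected_res = mod(mod(a,m) + mod(b,m), m);    % small, independent residue adder
check_ok     = (mod(s, m) == expected_res);    % passes when the sum is correct

s_faulty       = s + 2^4;                      % single bit flip in the sum
fault_detected = (mod(s_faulty, m) ~= expected_res);
```

A single flipped bit changes the sum by a power of two, which is never a multiple of 3, so this check catches every single-bit error in the result but, unlike RPR, offers no approximate value to fall back on.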
APPENDIX A. RPR MODULE DESIGN 


A.1 REPRESENTATIVE RPR AND TMR ADDER OPERATION MODULES 



Figure 65. RPR 8/64 Addition - Operation Module. 

Figure 66. TMR 64-bit Addition - Operation Module. 



A.2 REPRESENTATIVE RPR AND TMR ADDER VOTER MODULES 

Figure 67. RPR 8/64 Addition - Voter Module. 

Figure 68. TMR 64-bit Addition - Voter Module. 

A.3 REPRESENTATIVE RPR AND TMR MULTIPLIER OPERATION MODULES 

Figure 69. RPR 16/64 Multiplication - Operation Module. 

A.4 REPRESENTATIVE RPR AND TMR MULTIPLIER VOTER MODULES 

Figure 71. RPR 16/64 Multiplication - Voter Module with 2r-bit Comparators. 

Figure 72. TMR 64-bit Multiplication - Voter Module (Bitwise Majority). 

APPENDIX B. COMPOUND OPERATION BOUND TESTING 


B.1 TESTING UPPER/LOWER BOUNDS ON MATRIX MULTIPLICATION 


%% MANUAL MATRIX MULT WITH BOUND CALCS AND COMPARISON 
clear all; close all; clc; format compact; format short 

sizem = 5;      % what size (square) matrix? 
scale = 100;    % scale for determining error in bounds 

mat1 = 2*rand(sizem)-1          % generate initial values 
mat2 = 2*rand(sizem)-1 

m1l = floor(scale*mat1)/scale   % generate lower bounds 
m2l = floor(scale*mat2)/scale 

m1u = ceil(scale*mat1);         % generate upper bounds 
m2u = ceil(scale*mat2); 
for i=1:sizem                   % make UPPER bound > LOWER bound 
    for j=1:sizem 
        if (m1u(i,j) == scale*mat1(i,j))    % alter for integers 
            m1u(i,j) = m1u(i,j) + 1; 
        end 
        if (m2u(i,j) == scale*mat2(i,j)) 
            m2u(i,j) = m2u(i,j) + 1; 
        end 
    end 
end 
m1u = m1u/scale 
m2u = m2u/scale 

prod = mat1*mat2;               % compute precise product 

pl = zeros(sizem); %mat1l*mat2l;   % INITIALIZE 
pu = zeros(sizem); %mat1u*mat2u; 

for i=1:sizem                   %fill rows of bound product matrices 
    for j=1:sizem               %fill cols of bound product matrices 
        for k=1:sizem           % each term in the inner product 
            if (mat1(i,k)>0 && mat2(k,j)>0)         % prod>0 THIS IS EQ 12 
                putemp = m1u(i,k)*m2u(k,j); 
                pltemp = m1l(i,k)*m2l(k,j); 
                pu(i,j) = pu(i,j) + putemp; 
                pl(i,j) = pl(i,j) + pltemp; 
            elseif (mat1(i,k)<0 && mat2(k,j)>0)     %prod<0 
                putemp = m1u(i,k)*m2l(k,j); 
                pltemp = m1l(i,k)*m2u(k,j); 
                pu(i,j) = pu(i,j) + putemp; 
                pl(i,j) = pl(i,j) + pltemp; 
            elseif (mat1(i,k)>0 && mat2(k,j)<0)     %prod<0 
                putemp = m1l(i,k)*m2u(k,j); 
                pltemp = m1u(i,k)*m2l(k,j); 
                pu(i,j) = pu(i,j) + putemp; 
                pl(i,j) = pl(i,j) + pltemp; 
            elseif (mat1(i,k)<0 && mat2(k,j)<0)     %prod>0 
                putemp = m1l(i,k)*m2l(k,j); 
                pltemp = m1u(i,k)*m2u(k,j); 
                pu(i,j) = pu(i,j) + putemp; 
                pl(i,j) = pl(i,j) + pltemp; 
            end 
        end 
    end 
end 

checku = gt(prod,pu)    %conduct bound checks (should all be 0) 
checkl = gt(pl,prod) 
checkb = gt(pl,pu) 

% PRINT ERROR MESSAGES, if applicable 
for i=1:sizem           % step through rows of the check matrices 
    for j=1:sizem       % step through cols of the check matrices 
        fprintf('\n (%d,%d): ',i,j); 
        if ((checku(i,j) || checkl(i,j)) && checkb(i,j)) 
            fprintf(' Bound error detected. Use PRECISE result.'); 
        elseif checku(i,j) || checkl(i,j) 
            fprintf(' Error detected. Use REDUCED-PRECISION result.'); 
        else 
            fprintf(' No error detected. Use PRECISE result.'); 
        end 
    end 
end 


B.2 CODE TESTING UPPER/LOWER BOUNDS ON FFT OPERATION 


%% MANUAL FFT WITH BOUNDS 
clear all; close all; clc; format compact; format short; 

% NOTE: must COMPUTE REAL and IMAG PARTS SEPARATELY (for bounds!) 
% RESERVE 'i' AS IMAGINARY SQRT(-1). 
scale = 100; 
N=16;   % number of input points 
n=N;    % number of output points (keep same as input for simplicity) 
w_inc = exp(-2*pi*sqrt(-1)/N)   % define twiddle factor (increment) wn 

% Generate wn lookup table - FULL PRECISION 
for k=1:n 
    for j=1:N 
        w(k,j) = w_inc^((j-1)*(k-1)); 
    end 
end 
wr = real(w); 
wi = imag(w); 
%disp(w) 
% NOTE: w IS DIAGONAL - INEFFICIENT TO STORE WHOLE MATRIX... but don't 
% worry about that here!! 

% Generate w bound matrices (not sure if this is necessary...) 
wrl = floor(scale*wr)/scale     % generate lower bounds 
wil = floor(scale*wi)/scale 

wru = ceil(scale*wr)            % generate upper bounds 
wiu = ceil(scale*wi) 
for k=1:N 
    for j=1:N 
        if (wru(k,j) == scale*wr(k,j))      % alter for integers 
            wru(k,j) = wru(k,j) + 1; 
        end 
        if (wiu(k,j) == scale*wi(k,j)) 
            wiu(k,j) = wiu(k,j) + 1; 
        end 
    end 
end 
wru = wru/scale 
wiu = wiu/scale 

% Generate input vector and placeholder output vector 
xr = 2*rand(N,1) - 1;   % generate input points (vector) 
XXr = zeros(n,1);       % initialize output (points) vector, reals 
XXi = zeros(n,1);       % initialize output (points) vector, imags 

% Generate input bound vectors (bounds on input values) 
xrl = floor(scale*xr)/scale;    % generate real input lower bounds 
xru = ceil(scale*xr);           % generate real input upper bounds 
for k=1:N 
    if (xru(k) == scale*xr(k))  % increment upper bounds of int reals 
        xru(k) = xru(k) + 1; 
    end 
end 
xru = xru/scale; 
xi = zeros(N,1);        % placeholders for imag parts 
% xil = xi;             % (xi NOT NEEDED when samples x are all REAL) 
% xiu = xi; 
xr_in = [xrl xr xru]    % check that xl < x < xu (SIGNED) 
% xi_in = [xil xi xiu]  % PLACEHOLDER VECTORS ONLY for imag 

XXrl = zeros(n,1);      % initialize output lower bounds (real) 
XXil = zeros(n,1);      % initialize output lower bounds (imag) 
XXru = zeros(n,1);      % initialize output upper bounds (real) 
XXiu = zeros(n,1);      % initialize output upper bounds (imag) 

% Perform transform on exact values xr + xi*i (although all xi = 0 here) 
% (Equation from MATLAB Help on function "fft(x)") 
for k=1:n 
    for j=1:N 
        XXr(k) = XXr(k) + (xr(j)*wr(k,j) - xi(j)*wi(k,j)); % REAL parts 
        XXi(k) = XXi(k) + (xr(j)*wi(k,j) + xi(j)*wr(k,j)); % IMAG parts 
    end 
end 
XXX = fft(xr);          % ACTUAL FFT for check/compare... 
compare_ReIm_Actual = [XXr XXi XXX] 
% (columns 1 and 2 should equal real/imag parts of col 3) 

% NOW APPLY TO BOUNDS 
% Create logical sign tests on x and w 
% NOTE: would also need x_imag IF the input set included complex samples. 
xrnn = logical(sign(xr)+1)  % element is TRUE if corresp # is NONNEGATIVE 
wrnn = logical(sign(wr)+1) 
winn = logical(sign(wi)+1) 

% Write expressions for: XXrl XXru XXil XXiu 
% NOTE: if the input included complex samples, each bound would need both 
% terms of the complex product (with bounds chosen by sign): 
%   REAL parts:  xr(j)*wr(k,j) - xi(j)*wi(k,j) 
%   IMAG parts:  xr(j)*wi(k,j) + xi(j)*wr(k,j) 
% BUT as it stands, xi(j) = 0 for all j so we don't need the final term!! 
for k=1:n 
    for j=1:N 
        if (xrnn(j) && wrnn(k,j) && winn(k,j))      % OPERANDS ARE DIFF BOUNDS 
            XXru(k) = XXru(k) + (xru(j)*wru(k,j));  % BASED ON SIGNS OF INPUT 
            XXiu(k) = XXiu(k) + (xru(j)*wiu(k,j));  % (Eq 12 again) 
            XXrl(k) = XXrl(k) + (xrl(j)*wrl(k,j)); 
            XXil(k) = XXil(k) + (xrl(j)*wil(k,j)); 
        elseif (xrnn(j) && wrnn(k,j) && ~winn(k,j)) 
            XXru(k) = XXru(k) + (xru(j)*wru(k,j)); 
            XXiu(k) = XXiu(k) + (xrl(j)*wiu(k,j)); 
            XXrl(k) = XXrl(k) + (xrl(j)*wrl(k,j)); 
            XXil(k) = XXil(k) + (xru(j)*wil(k,j)); 
        elseif (xrnn(j) && ~wrnn(k,j) && winn(k,j)) 
            XXru(k) = XXru(k) + (xrl(j)*wru(k,j)); 
            XXiu(k) = XXiu(k) + (xru(j)*wiu(k,j)); 
            XXrl(k) = XXrl(k) + (xru(j)*wrl(k,j)); 
            XXil(k) = XXil(k) + (xrl(j)*wil(k,j)); 
        elseif (xrnn(j) && ~wrnn(k,j) && ~winn(k,j)) 
            XXru(k) = XXru(k) + (xrl(j)*wru(k,j)); 
            XXiu(k) = XXiu(k) + (xrl(j)*wiu(k,j)); 
            XXrl(k) = XXrl(k) + (xru(j)*wrl(k,j)); 
            XXil(k) = XXil(k) + (xru(j)*wil(k,j)); 
        elseif (~xrnn(j) && wrnn(k,j) && winn(k,j)) 
            XXru(k) = XXru(k) + (xru(j)*wrl(k,j)); 
            XXiu(k) = XXiu(k) + (xru(j)*wil(k,j)); 
            XXrl(k) = XXrl(k) + (xrl(j)*wru(k,j)); 
            XXil(k) = XXil(k) + (xrl(j)*wiu(k,j)); 
        elseif (~xrnn(j) && wrnn(k,j) && ~winn(k,j)) 
            XXru(k) = XXru(k) + (xru(j)*wrl(k,j)); 
            XXiu(k) = XXiu(k) + (xrl(j)*wil(k,j)); 
            XXrl(k) = XXrl(k) + (xrl(j)*wru(k,j)); 
            XXil(k) = XXil(k) + (xru(j)*wiu(k,j)); 
        elseif (~xrnn(j) && ~wrnn(k,j) && winn(k,j)) 
            XXru(k) = XXru(k) + (xrl(j)*wrl(k,j)); 
            XXiu(k) = XXiu(k) + (xru(j)*wil(k,j)); 
            XXrl(k) = XXrl(k) + (xru(j)*wru(k,j)); 
            XXil(k) = XXil(k) + (xrl(j)*wiu(k,j)); 
        elseif (~xrnn(j) && ~wrnn(k,j) && ~winn(k,j)) 
            XXru(k) = XXru(k) + (xrl(j)*wrl(k,j)); 
            XXiu(k) = XXiu(k) + (xrl(j)*wil(k,j)); 
            XXrl(k) = XXrl(k) + (xru(j)*wru(k,j)); 
            XXil(k) = XXil(k) + (xru(j)*wiu(k,j)); 
        end 
    end 
end 

% compare_bounds_Real = [XXrl XXr XXru]; % should be col1<col2<col3 
% compare_bounds_Imag = [XXil XXi XXiu]; % (if you want to see the values) 

% Conduct checks of precise against bounds and bounds against each other 
checkRu = gt(XXr,XXru); 
checkRl = gt(XXrl,XXr); 
checkRb = gt(XXrl,XXru); 
ErrorsReal_UpLoBd = [checkRu checkRl checkRb] 

checkIu = gt(XXi,XXiu); 
checkIl = gt(XXil,XXi); 
checkIb = gt(XXil,XXiu); 
ErrorsImag_UpLoBd = [checkIu checkIl checkIb] 

% end of FFT execution with bound testing 
APPENDIX C. SYSTEM SIMULATION CODE 


C.1 ADCS ERROR INJECTION MODEL SCRIPT FOR TESTING 

%% Experiment with BRMS PD Control Model - Error Injection 
clear angle_hist ctrl_NOerr ctrl_witherr snr; 
close all; clc; format compact; format long; 

% Set simulation parameters 
time = 50; 
simstep = 0.025;    %fixed-step size (called in Config Params - Solver) 

% Set error injection parameters 
errtime = 5; 
errtype = 2;    % 0 for NONE, 1 for spike (transient), 2 for step (config/persist) 
snr = [ 27 inf inf ];   % 3 dB for no fault-tol; 27 for r=8, 51 for r=16 
pin = .025; 

% Run simulation 
[t,x] = sim('main_errl',time); 

% Plot Data (don't forget to check figure TITLES for trials!!) 
close;clc 

% Plot both attitude and control in one plot - SUBPLOTS 
figure(13); 
subplot(2,1,1),plot(t,angle_hist(:,1),'-b',t,angle_hist(:,2),'--k',... 
    t,angle_hist(:,3),'-.k'); 
grid on 
h = findobj(gca,'type','line'); 
set(h,'linewidth',2); 
xlabel('Time (s)','fontsize',16); 
ylabel('Angular Position (\phi,\theta,\psi) (\circ)','fontsize',16); 
title('RPR (\itr\rm = 8) Persistent Error in T_x - Angular Position',... 
    'fontsize',24); 
% title('Unbounded Persistent Error in T_x - Angular Position (500 s)',... 
%     'fontsize',24); 
lh1 = legend('Roll (\phi)','Pitch (\theta)','Yaw (\psi)'); 
%ylim([0 1.3]); 

subplot(2,1,2),plot(t,ctrl_NOerr(:,1),'-b',t,ctrl_NOerr(:,2),'--k',... 
    t,ctrl_NOerr(:,3),'-.k'); 
grid on 
h = findobj(gca,'type','line'); 
set(h,'linewidth',2); 
xlabel('Time (s)','fontsize',16); 
ylabel('Control Command (T_x,T_y,T_z) (N-m)','fontsize',16); 
title('RPR (\itr\rm = 8) Persistent Error in T_x - Commanded Control',... 
    'fontsize',24); 
% title('Unbounded Persistent Error in T_x - Commanded Control (500 s)',... 
%     'fontsize',24); 
lh2 = legend('T_x','T_y','T_z'); 
%ylim([0 1.3]); 


% Plot both attitude and control in one plot - OVERLAID AXES 
figure(14); 
title('Unbounded Persistent Error in T_x - Angular Position and Commanded Control','fontsize',24); 
%title('RPR (\itr\rm = 8) Persistent Error in T_x - Angular Position and Commanded Control','fontsize',24); 

hl1 = line(t,angle_hist(:,1),'Color','b','linestyle','--'); 
hl2 = line(t,angle_hist(:,2),'Color','b','linestyle','-'); 
hl3 = line(t,angle_hist(:,3),'Color','b','linestyle','-.'); 
grid on 
ax1 = gca; 
set(ax1,'YColor','b'); 
xlabel('Time (s)','fontsize',16); 
ylabel('Angular Position (\phi,\theta,\psi) (\circ)','fontsize',16); 
l1 = legend('Roll (\phi)','Pitch (\theta)','Yaw (\psi)','location','n'); 

ax2 = axes('Position',get(ax1,'Position'),... 
    'YAxisLocation','right',... 
    'Color','none',... 
    'XColor','k','YColor','k'); 
%ylim([-.06 .08]); 
ylabel('Control Command (T_x,T_y,T_z) (N-m)','fontsize',16); 

hl4 = line(t,ctrl_NOerr(:,1),'Color','k','linestyle','--','Parent',ax2); 
hl5 = line(t,ctrl_NOerr(:,2),'Color','k','linestyle','-','Parent',ax2); 
hl6 = line(t,ctrl_NOerr(:,3),'Color','k','linestyle','-.','Parent',ax2); 
l2 = legend('T_x','T_y','T_z','location','s'); 
set(l2,'Color','white'); 

ha1 = findobj(gcf,'type','axes'); 
ha2 = findobj(ha1,'type','line'); 
set(ha2,'linewidth',2); 

% end of ADCS performance evaluation code 

C.2 FFT ERROR INJECTION MODEL SCRIPT FOR TESTING 

%% FFT: FLOATING-POINT WITH AWGN INJECTION 
clear all; close all; clc; format compact; format long 

% Set FFT parameters 
N = 16;         % for now 
sets = 1000;    % sim time, also num FFTs (batches?) calculated 

% Set error injection parameters 
errsig = 7; 
%errtrial = 1; 
%errtype = 0;   % 0 for spike (data/trans), 1 for step (config/persist) 
errvec = zeros(N,1);    % initialize error selection vector 
errvec(errsig) = 1;     % specify which signal has error 

% Set noise parameters 
nseed = floor(1000*rand(N,1)); 
snr = 0;        % 0 for no fault-tolerance, **TBD** for RPR fault-tol 
vccin = 2.5;    % volts 
curin = 10;     % mA 
pin = vccin * 0.001*curin 
% NOTE: Vccin nominally 2.5 V @ < 100 mA for Xilinx 1 
% --> set power in "signal in" to P = I*V for noise! 

% Run simulation 
sim('newFFT_16pt_snrl',sets); 
% note: EACH TRIAL is a COLUMN of data in output arrays. 

% Check (relative) error in input and output 
in_err = abs((fft_in_witherr - fft_in_NOerr)./fft_in_NOerr)'; 
out_err = abs((fft_out_witherr - fft_out_NOerr)./fft_out_NOerr)'; 

in_erravg = mean(in_err,2) 
out_erravg = mean(out_err,2) 

% end of FFT error simulation 
APPENDIX D. BRMSS CODE FOR DYNAMICS MODEL AND CONTROLLER 


D.1 EFFECT OF APPENDAGES ON SYSTEM MOI (J) 

clear all; close all; format compact; clc; 

% Use experimental rigid-body MOI from paper (J. J. Kim, NPS, 2008) 

Jb = [130.34 3.01 10.52; 3.02 174.64 -0.40; 10.52 -0.40 181.23]; 

%mrvec = [0.00196; 0.00481; 0.19695]; %center of gravity vector, m*r 

% Calculate moment of inertia of each appendage 

m = 22.6796; %bg, mass of each cyl. "end" (2 per bar/appendage, 501b) 
%NOTE: each cyl end mass is modeled as a POINT MASS @ R 
r = 0.15; %m, radius of end cylinder {CHECK THIS??) 

R = 0.727; %m, radius from center of bar to CENTER of end (point) mass 

%NOTE: applies to all 3 appendages (any orient'n) 

%Iml = m^R'^C; %kg-m'"2, MOI of one end mass (PARALLEL AXIS THM) 

%Im2 = Iml; %kg-m'^2, MOI of other end mass 

M = 11.3398; %hg, mass of bar (1 per appendage, 251b) 
a = 0.1; %m, width of bar (in plane of rotation, any orient'n) 

b = 1.27; %m, length of bar (connecting two end masses, < 2R) 

c = 0.02; %m, EST thickness of bar (thin dim) 

Ma = M + 2*m; %kg, total mass of one appendage (any 1,2,3) 

% Principle moments of inertia of appendages in their local body axes 
lalxx = (1/12) *M* {a"'2+b"'2)+2*m*R"'2; 
lalyy = (1/12) *M* {a'"2 + c"2)+2*0.5*m*r'"2; 
lalzz = (1/12) *M* {b'"2 + G'"2)+2*m*R'"2; 

Jal = diag([lalxx lalyy lalzz]); Jalpr = diag(Jal); 

Ja2 = Jal; Ja2pr = Jalpr; 

Ja3 = diag([lalyy lalzz lalxx]); Ja3pr = diag(Ja3); 

% NOTE: Now calculate torsional SPRING CONSTANT for appendages 
% fn = sqrt(kspring/MOIapp) for rotational bodies 

fn = 0.1; %Hz, tuned natural frequency of each appendage (from J.J. Kim) 
k = fn''2*Ialxx; %N-m/rad, torsional spring constant for each app. 

% Spring constant is the same for all; x-axis for 1/2 & z-axis for 3. 

% Coordinate transformations using rotation matrices from body to apps 
AL = -pi/4; %rad, angle between body X and appendage 1 
BE = -3*pi/4; %rad, angle between body X and appendage 2 
%(NOTE: appendage 3 is ALIGNED with body Y axis, so no angle is needed) 


gl = 

0; 

%rad. 

angle 

position 

of 

appendage 

1 w. r. t. 

springl 

X 

axis 

g2 = 

0; 

%rad. 

angle 

position 

of 

appendage 

2 w. r. t. 

spring2 

X 

axis 

g3 = 

0; 

%rad. 

angle 

position 

of 

appendage 

3 w.r.t. 

spring3 

X 

axis 


% Rotation matrices for Appendage 1 
bCsl = [ cos(AL) sin{AL) 0; 

-sin(AL) cos(AL) 0; 

0 0 1 ] ; 
slCal = [1 0 0; 

0 cos(gl) sin(gl); 

0 -sin(gl) cos(gl); ]; 
bCal = bCsl*slCal; 

% Rotation matrices for Appendage 2 
bCs2 = [ cos (BE) sin(BE) 0; 

-sin(BE) cos(BE) 0; 

0 0 1 ] ; 
s2Ca2 = [1 0 0; 

0 cos{g2) sin {g2); 

0 -sin{g2) cos{g2); ]; 
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bCa2 = bCs2*s2Ca2; 

% Rotation matrices for Appendage 3 
bCa3 = [ cos{g3) sin{g3) 0; 

-sin(g3) cos(g3) 0; 

0 0 1 ] ; 

% {ONLY ONE ROTATION NEEDED for appendage 3) 


% Expressing each appendage's MOT 
bJ_al = bCal*Jal*bCal'; %coord 

bJ_a2 = bCa2*Ja2*bCa2'; %coord 

bJ_a3 = bCa3*Ja3*bCa3'; %coord 


in CENTRAL 
txfmation 
txfmation 
txfmation 


BODY FRAME 
(Likins eq 
(Likins eq 
(Likins eq 


8.35a, 
8.35a, 
8.35a, 


p.427) 

p.427) 

p.427) 


% Vectors from central body origin O to each appendage (spring)
Rb = 0.762; %m, radius of central body
Ra1 = 0.28575; %m, extension of appendage 1 arm beyond rigid body radius
Ra2 = Ra1; %m, extension of appendage 2 arm beyond rigid body radius
Ra3 = 0.36195; %m, extension of appendage 3 arm beyond rigid body radius
S1 = Rb + Ra1;
S2 = Rb + Ra2;
S3 = Rb + Ra3;
sa1 = [S1*cos(AL) S1*sin(AL) 0]';
sa2 = [S2*cos(BE) S2*sin(BE) 0]';
sa3 = [0 S3 0]';

% Contribution of each appendage to MOI of total BRMSS system
% (build skew-symmetric matrices)
st1 = [ 0 -sa1(3) sa1(2);
        sa1(3) 0 -sa1(1);
       -sa1(2) sa1(1) 0 ];
st2 = [ 0 -sa2(3) sa2(2);
        sa2(3) 0 -sa2(1);
       -sa2(2) sa2(1) 0 ];
st3 = [ 0 -sa3(3) sa3(2);
        sa3(3) 0 -sa3(1);
       -sa3(2) sa3(1) 0 ];

% Parallel-axis shift: each appendage uses its own skew-symmetric matrix.
bJba1 = bJ_a1 - Ma*st1*st1; %kg-m^2 (all three bJba(n))
bJba2 = bJ_a2 - Ma*st2*st2; % YES it's '-'!! -->EX 8.4.9 p
bJba3 = bJ_a3 - Ma*st3*st3; % fig 8.5 p 424 & eq 8.28!!
% (TOTAL MOI is larger than bJ_a(n), *despite* the '-' :) )
% REMEMBER: if you do NOT assume gamma = 0, these are all time-varying!


% TOTAL BRMSS MOI:
Jtot = Jb + bJba1 + bJba2 + bJba3; %time-invariant *IF* GAMMA(n) = 0.
Jpr = diag(Jtot);
Jd = diag(Jpr); % drop cross-terms to get just PRINCIPAL AXES on DIAG.
%JdInv = inv(Jd); % inverse of J... for pre-mult for EOM?
Jgs = Ia1xx*eye(3);
Jeom = [ Jd zeros(3);
         zeros(3) Jgs ];
Jsys = [ Jeom zeros(6);
         zeros(6) eye(6) ];
Jinv = inv(Jsys); % inversion of MOI matrix for EOM calcs
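
The two steps applied to each appendage in the listing above, rotating the inertia tensor into body axes and then shifting it to the body origin with a skew-symmetric matrix, can be checked on a small stand-alone example. The sketch below uses made-up values (m, J_local, theta and s are illustrative assumptions, not BRMSS data). It confirms numerically that -m*S*S equals the more familiar parallel-axis form m*((s'*s)*I - s*s'), which is why the minus sign still increases the total inertia.

```matlab
% Illustrative values only (assumptions, not taken from the BRMSS model)
m       = 2.0;                       % kg, appendage mass
J_local = diag([0.10 0.20 0.25]);    % kg-m^2, inertia in appendage axes
theta   = -pi/4;                     % rad, rotation about body Z
s       = [0.5 -0.5 0]';             % m, body origin to appendage

% Rotation matrix about Z (same pattern as the body-to-spring matrices)
C = [ cos(theta) sin(theta) 0;
     -sin(theta) cos(theta) 0;
      0          0          1 ];

% Step 1: express the appendage inertia in body axes
J_body = C*J_local*C';

% Step 2: skew-symmetric matrix of s, then the parallel-axis shift
S = [ 0    -s(3)  s(2);
      s(3)  0    -s(1);
     -s(2)  s(1)  0 ];
J_shift_skew = -m*S*S;                      % form used in the listing
J_shift_std  =  m*((s'*s)*eye(3) - s*s');   % textbook parallel-axis form

J_about_origin = J_body + J_shift_skew;

% The two forms agree, and the shift never reduces the inertia
err     = norm(J_shift_skew - J_shift_std);
min_eig = min(eig(J_shift_skew));
```

Because the shift term is positive semidefinite, the trace of J_about_origin is never smaller than the trace of J_local, whatever offset vector is chosen.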


D.2 STATE-SPACE SYSTEM DESCRIPTION/CONTROL USING (J) 

% NOTE: See ch V for EOM and steps to simplify them in order to get A,B!! 

% State-Space Model of Linear Dynamics System WITH NOISE: 

% xdot(t) = A(t)*x(t) + B(t)*u(t) + v(t)
% y(t) = C(t)*x(t) + w(t)
% where x is the state vector, y are the observable states, u is the
% control, v is the model uncertainty (noise), w is the measurement
% uncertainty (noise), A is the system/plant, B the control distribution,
% and C is the selection matrix that pulls observable states from x.
n=12; q=3; p=3; % dim of x, y, u
% qx qy qz g1 g2 g3 ]

Acoeffs = -k*[ 3 0 0 -cos(AL) -cos(BE) 0 ;
               0 3 0 sin(AL) sin(BE) 0 ;
               0 0 3 0 0 -1 ;
              -cos(AL) sin(AL) 0 1 0 0 ;
              -cos(BE) sin(BE) 0 0 1 0 ;
               0 0 -1 0 0 1 ];
% '-k' because they are going on the other side of the equation now.
zeta = 0.0001;
wn = 2*pi*fn;
Adamp = -2*zeta*wn*[ 0 0 0 0 0 0 ;
                     0 0 0 0 0 0 ;
                     0 0 0 0 0 0 ;
                     0 0 0 1 0 0 ;
                     0 0 0 0 1 0 ;
                     0 0 0 0 0 1 ];

Atemp = [ Adamp Acoeffs; eye(n/2) zeros(n/2) ]; 

A = Jinv*Atemp; 

Btemp = [ eye(p); zeros(n-p,p) ]; 

B = Jinv*Btemp; 

C = [eye(q) zeros(q,n-q) ] ; 

D = zeros(p); 


% Asub = [ -3*k/Jpr(1) 0 0 k*cos(AL)/Jpr(1) k*cos(BE)/Jpr(1) 0 ;
%          0 -3*k/Jpr(2) 0 -k*sin(AL)/Jpr(2) -k*sin(BE)/Jpr(2) 0 ;
%          0 0 -3*k/Jpr(3) 0 0 k/Jpr(3) ;
%          k*cos(AL)/Ja1pr(1) -k*sin(AL)/Ja1pr(1) 0 -k/Ja1pr(1) 0 0 ;
%          k*cos(BE)/Ja2pr(1) -k*sin(BE)/Ja2pr(1) 0 0 -k/Ja2pr(1) 0 ;
%          0 0 k/Ja3pr(3) 0 0 -k/Ja3pr(3) ];
% zeta = 0.0001;
% wn = 2*pi*fn;
% Adam = 2*zeta*wn*[zeros(3) zeros(3);
%                   zeros(3) eye(3) ];
% %from Ch.V, Eq 32 (# for now)
% Bsub = [1/Jpr(1) 0 0; 0 1/Jpr(2) 0; 0 0 1/Jpr(3) ];
% A = [Adam Asub; eye(n/2) zeros(n/2)];
% B = [Bsub; zeros(n-p,p)];
% C = [eye(q) zeros(q, n-q) ];
% D = zeros(p);
% Ts = [];


%brmss = ss(A,B,C,D,Ts); % ESTABLISHES THE STATE-SPACE SYSTEM 'brmss'

% NOTE: 'Ts' dictates DISCRETE TIME MODEL, step unspecified. CHECK THIS!! 
brmss = ss(A,B,C,D) % CONTINUOUS TIME state-space system 'brmss' 

% Now refer to brmss.a, b, c, d in the plant/system dynamics subsystem of model 

% Check controllability of the system 

% (if rank of controllability matrix = number of states n, it IS CONTROLLABLE)
wantzero = n - rank(ctrb(brmss))

% Generate noise and weighting matrices 
QXU = eye(15); % PLACEHOLDER 

QWV = eye(15); % PLACEHOLDER 

LQGreg = lqg(brmss,QXU,QWV)

% Now refer to LQGreg.a, b, c, d in the controller in the Simulink model 
% end of file as of November 2008 
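
The controllability check in the listing relies on ctrb from the Control System Toolbox. A minimal toolbox-free sketch of the same test is given below for a small assumed system (a double integrator, chosen only for illustration and unrelated to the BRMSS plant): build [B, A*B, ..., A^(n-1)*B] and compare its rank with the number of states n. The example also illustrates that the comparison is made against n; for this system rank(A) is only 1 even though the pair (A,B) is controllable.

```matlab
% Illustrative system (assumption): double integrator, xdot = A*x + B*u
A = [0 1;
     0 0];
B = [0;
     1];
n = size(A,1);            % number of states
m = size(B,2);            % number of inputs

% Build the controllability matrix [B, A*B, ..., A^(n-1)*B]
Co  = zeros(n, n*m);
blk = B;
for i = 1:n
    cols = (i-1)*m + (1:m);
    Co(:, cols) = blk;
    blk = A*blk;
end

rank_Co  = rank(Co);
wantzero = n - rank_Co;   % 0 means the pair (A,B) is controllable
rank_A   = rank(A);       % not a substitute for n in the test above
```

For the 12-state plant in this appendix the same loop would produce a 12 x 36 matrix; a numerically more robust test may be preferable for systems with widely spread entries.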

D.3 SYSTEM AND CONTROLLER A,B,C,D MATRICES (OUTPUT) 

a = 
   [12 x 12 plant matrix over states x1 to x12; the printed entries did not
    survive in the source listing. Legible fragments include 0.004451 and
    -4.927e-006.]

b = 
   [12 x 3 input matrix over inputs u1 to u3; the printed entries did not
    survive in the source listing.]

c = 
        x1   x2   x3   x4   x5   x6   x7   x8   x9  x10  x11  x12
   y1    1    0    0    0    0    0    0    0    0    0    0    0
   y2    0    1    0    0    0    0    0    0    0    0    0    0
   y3    0    0    1    0    0    0    0    0    0    0    0    0

d = 
        u1   u2   u3
   y1    0    0    0
   y2    0    0    0
   y3    0    0    0

Continuous-time model.

wantzero = 
     0

a = 
   [controller state matrix over estimator states x1_e, x2_e, ...; the
    printed entries did not survive in the source listing. Legible fragments
    include -1.155, -0.01741, 0.01741, -0.01425, -0.01, 0.007071 and 8.727.]

b = 
   [controller input matrix; the printed entries did not survive in the
    source listing.]

c = 
   [controller output matrix, rows u1 to u3 over x1_e, x2_e, ...; the
    printed entries did not survive in the source listing. Legible fragments
    include 0.4888, -0.4888, 0.5338, -1.77 and 0.7771.]

d = 
        y1   y2   y3
   u1    0    0    0
   u2    0    0    0
   u3    0    0    0

Input groups: 
   Name           Channels
   Measurement    1,2,3

Output groups: 
   Name           Channels
   Controls       1,2,3

Continuous-time model.





LIST OF REFERENCES 


[1] “Digital Hardware,” class notes for EC4530, Department of Electrical and 
Computer Engineering, Naval Postgraduate School, Summer 2008. 

[2] “Space Radiation Effects,” class notes for SS3035, Space Systems Academic 
Group, Naval Postgraduate School, Spring 2007. 

[3] J. Snodgrass, “Low-Power Fault Tolerance For Spacecraft FPGA-Based 
Numerical Computing,” Ph.D. dissertation, Naval Postgraduate School, 
Monterey, CA, 2006. 

[4] A. C. Tribble, D. J. Gorney, J. B. Blake, H. C. Koons, M. Schulz, A. L. Vampola, 
R. L. Walterscheid, J. R. Wertz, “The Space Environment,” in Space Mission 
Analysis and Design, 3rd Edition, J. R. Wertz and W. J. Larson, Ed. El Segundo: 
Microcosm Press, 1999, pp. 203-221. 

[5] P. Nordin and M. K. Kong, “Hardness and Survivability Requirements,” in Space 
Mission Analysis and Design, 3rd Edition, J. R. Wertz and W. J. Larson, Ed. El 
Segundo: Microcosm Press, 1999, pp. 221-238. 

[6] R. W. Hamming, “Error Detecting and Correcting Codes,” in Bell System 
Technical Journal, 1950, pp. 147-160. 

[7] S. B. Wicker, Error Control Coding for Digital Communication and Storage. 
Englewood Cliffs, NJ: Prentice-Hall, 1995. 

[8] “Triple Modular Redundancy,” class notes for EC4810, Department of Electrical 
and Computer Engineering, Naval Postgraduate School, Summer 2007. 

[9] J. Coudeyras, “Radiation Testing of the Configurable Fault Tolerant Processor 
(CFTP) For Space-based Applications,” M.S. thesis, Naval Postgraduate School, 
Monterey, CA, 2005. 

[10] D. Ebert, “Design and Development of a Configurable Fault-Tolerant Processor 
(CFTP) for Space Applications,” M.S. thesis, Naval Postgraduate School, 
Monterey, CA, 2003. 

[11] P. Majewicz, “Implementation of a Configurable Fault-Tolerant Processor 
(CFTP) Using Internal Triple Modular Redundancy (TMR),” M.S. thesis, Naval 
Postgraduate School, Monterey, CA, 2005. 





[12] G. Caldwell, “Implementation of Configurable Fault Tolerant Processor (CFTP) 
Experiments,” M.S. thesis, Naval Postgraduate School, Monterey, CA, 2006. 

[13] B. N. Agrawal, Design of Geosynchronous Spacecraft. Englewood Cliffs, NJ: 
Prentice-Hall, 1986. 

[14] J. S. Eterno, “Attitude Determination and Control,” in Space Mission Analysis 
and Design, 3rd Edition, J. R. Wertz and W. J. Larson, Ed. El Segundo: 
Microcosm Press, 1999, pp. 354-380. 

[15] J. P. B. Vreeburg, “Spacecraft Maneuvers and Slosh Control,” IEEE Control 
Systems Magazine, pp. 12-16, June 2005. 

[16] T. E. Suttles and R. E. Beverly, “Model for solar torque effects on DSCS II,” 
Journal of Astronautical Sciences, vol. 24, pp. 165-184, 1976. 

[17] C. B. Spence, Jr., “Environmental Torques,” in Spacecraft Attitude Determination 
and Control, J. R. Wertz, Ed. Springer, 1978, pp. 566-583. 

[18] I. M. Ross, Control and Optimization: An Introduction to Principles and 
Applications, Electronic Edition. Monterey, CA: Naval Postgraduate School, 
2005. 

[19] “Design of PID Control Law for Rotation About a Single Axis for a Rigid Body 
Spacecraft,” class notes for AE3818, Department of Mechanical and 
Astronautical Engineering, Naval Postgraduate School, Fall 2007. 

[20] J. H. Reed, Software Radio: A Modern Approach to Radio Engineering. Upper 
Saddle River, NJ: Prentice-Hall, 2002. 

[21] R. Cristi, Modern Digital Signal Processing. Pacific Grove: Brooks/Cole, 2004. 

[22] A. V. Oppenheim, R. W. Schafer, Digital Signal Processing, Prentice-Hall, 1975. 

[23] D. A. Wright, “Field-Programmable Gate Array-Based Software Defined Radio,” 
M.S. thesis, Naval Postgraduate School, Monterey, CA, 2008. 

[24] Xilinx, Inc., “Fast Fourier Transform v4.1,” Xilinx® LogiCORE Product 
Specification DS260, April 2, 2007. 

[25] Xilinx, Inc., “Virtex™ 2.5 V Field Programmable Gate Arrays Architectural 
Description,” Xilinx Product Specification DS003-2 (v2.8.1), December 9, 2002. 





[26] J. H. Wilkinson, Rounding Errors in Algebraic Processes. Englewood Cliffs, 
New Jersey: Prentice-Hall, © 1963 British Crown. 

[27] R. K. Richards, Arithmetic Operations in Digital Computers. Princeton, New 
Jersey: D. Van Nostrand Company, Inc., 1955. 

[28] B. Parhami, Computer Arithmetic: Algorithms and Hardware Designs. New 
York: Oxford University Press, 2000. 

[29] “Pipeline Division,” class notes for EC4830, Department of Electrical and 
Computer Engineering, Naval Postgraduate School, Spring 2008. 

[30] Xilinx, Inc., “Virtex™ 2.5 V Field Programmable Gate Arrays Electrical 
Characteristics,” Xilinx Product Specification DS003-3 (v3.2), September 10, 
2002. 

[31] J. J. Kim (private communication), 2008. 

[32] J. J. Kim and B. N. Agrawal, “Automatic Mass Balancing of Air-Bearing Based 
Three-Axis Rotational Spacecraft Simulator,” Naval Postgraduate School, 
Monterey, CA, October 2008. 

[33] P. W. Likins, Elements of Engineering Mechanics. New York: McGraw-Hill, 
1973. 

[34] J. L. Junkins, Y. Kim, Introduction to Dynamics and Control of Flexible 
Structures. Washington, D.C.: American Institute of Aeronautics and 
Astronautics, Inc., 1993. 

[35] R. F. Stengel, Optimal Control and Estimation. Mineola, New York: Dover, 1994. 





INITIAL DISTRIBUTION LIST 


1. Defense Technical Information Center 
Ft Belvoir, Virginia 

2. Dudley Knox Library 
Naval Postgraduate School 
Monterey, California 

3. Chairman, MAE Department, Millsaps 
Naval Postgraduate School 
Monterey, California 

4. Chairman, ECE Department, Knorr 
Naval Postgraduate School 
Monterey, California 

5. Professor Brij Agrawal 
Naval Postgraduate School 
Monterey, California 

6. Professor Herschel H. Loomis, Jr. 
Naval Postgraduate School 
Monterey, California 

7. Professor Alan A. Ross, Jr. 

Naval Postgraduate School 
Monterey, California 

8. Ms. Donna Miller 

Naval Postgraduate School 
Monterey, California 

9. AFIT/CIP 

Air Force Institute of Technology 
Wright-Patterson Air Force Base, Ohio 





